The Fleming-Viot measure-valued diffusion is a Markov process describing
the evolution of (allelic) types under mutation, selection and
random reproduction. We enrich this process by genealogical relations
of individuals so that the random type distribution as well as the
genealogical distances in the population evolve stochastically. The state
space of this tree-valued enrichment of the Fleming-Viot dynamics with
mutation and selection (TFVMS) consists of marked ultrametric measure
spaces, equipped with the marked Gromov-weak topology and a
suitable notion of polynomials as a separating algebra of test functions.

The construction and study of the TFVMS is based on a well-posed 
martingale problem. For existence, we use approximating finite popu- 
lation models, the tree-valued Moran models, while uniqueness follows 
from duality to a function-valued process. Path properties of the result- 
ing process carry over from the neutral case due to absolute continu- 
ity, given by a new Girsanov-type theorem on marked metric measure 
spaces. 

To study the long-time behavior of the process, we use a duality 
based on ideas from Dawson and Greven [On the effects of migration 
in spatial Fleming-Viot models with selection and mutation (2011c) 
Unpublished manuscript] and prove ergodicity of the TFVMS if the 
Fleming-Viot measure-valued diffusion is ergodic. As a further application,
we consider the case of two allelic types and additive selection. For
small selection strength, we give an expansion of the Laplace transform 
of genealogical distances in equilibrium, which is a first step in showing 
that distances are shorter in the selective case. 
1. Introduction. Genealogies are fundamental in studying population
models. In this paper, we focus on the large population limit of constant size
populations evolving under resampling, selection and mutation in a stochastic
fashion. The type distribution of this limit is modeled by the Fleming-Viot
measure-valued diffusion. Here, resampling is the random reproduction
of individuals, mutation is the random change of (allelic) types of individuals
and selection is the dependence of offspring numbers on the types. By
defining random reproduction we obtain ancestral relations between indi- 
viduals described by a randomly evolving genealogy. In our approach, we 
model both the genealogical and the type structure in the population. 

Populations under selection are modeled either by finitely or by infinitely 
many individuals (diffusion). An analysis of the former was carried out using
the biased voter model by Neuhauser and Krone (1997) and Krone and 
Neuhauser (1997). The large-population limit of the type frequencies leads 
to the measure-valued Fleming-Viot dynamics; see, for example, Fleming 
and Viot (1978), Dawson (1993), Ethier and Kurtz (1993), Donnelly and 
Kurtz (1996, 1999), Dawson and Greven (1999, 2011, 2012a, 2012b). A main 
tool in the mathematical analysis of these models is historical information 
about the population in the form of genealogical relations of individuals. 

In applications, genealogies of a population sample are most important. 
In particular, mutation rate estimators are based on the average genealogi- 
cal distance or the tree length of the genealogical tree spanned by a sample 
of individuals [Watterson (1975), Tajima (1983)]. Moreover, the enrichment 
of population models by information on ancestral lines has become common 
[e.g., Kaplan, Darden and Hudson (1988), Kaplan, Hudson and Langley 
(1989)]. To cope with the modeling needs in population genetics, many ex- 
tensions and generalizations of the Fleming-Viot dynamics have been given, 
for example, the evolution under recombination [see, e.g., Dawson (1993), 
Ethier and Kurtz (1993), Donnelly and Kurtz (1996, 1999)], as well as the 
evolution of a spatially distributed population [Dawson, Greven and Vaillan- 
court (1995), Dawson and Greven (1999, 2011, 2012a, 2012b)] and general 
exchangeable modes of exchange of types [Bertoin and Le Gall (2003, 2005, 
2006)]. 

In order to understand the genealogical structure of population models, 
consider the neutral case (i.e., no selection) and a fixed time t first. Since 
the resampling mechanism is completely independent of allelic types, the 
genealogy can be constructed from the present to the past using common 
ancestors of ancestral lines. In the case of finite variance offspring distributions
[and a weak assumption on their third moments, Möhle and Sagitov
(2001)], the result is Kingman's coalescent [Kingman (1982)].

As populations evolve, the underlying genealogies evolve as well. Con- 
sequently, the resampling mechanism allows one to describe genealogical 
information of individuals at all times. The main purpose of the present 
paper is to give a new approach to studying ancestral relationships under
selection via evolving genealogies. In particular, we extend the construction
of the tree-valued Fleming-Viot dynamics under neutrality carried out in
Greven, Pfaffelhuber and Winter (2012). Note that the resulting processes
are among the first tree-valued stochastic processes in the literature [but see
also Zambotti (2001, 2002, 2003), Evans, Pitman and Winter (2006), Evans
and Winter (2006), Evans and Lidman (2007)].

The difficulty in understanding the genealogical structure of a population 
under selection already arises for fixed time genealogies. Most importantly, 
types and offspring distributions of individuals are not independent in the 
selective case. To deal with this dependence, three different approaches have 
been used. 

First, Kaplan, Darden and Hudson (1988), Kaplan, Hudson and Langley 
(1989) condition the construction of the genealogy on the allelic frequency 
path; see also Kaj and Krone (2003), Barton, Etheridge and Sturm (2004), 
Etheridge, Pfaffelhuber and Wakolbinger (2006). If the allelic frequency path 
is known, and an allelic type is present with frequency $x \in [0, 1]$ at time $t$,
the rate of coalescence of two lines of this type is proportional to 1/x. This 
construction leads to valuable insights, for example, into the allelic types of 
ancestors of the population [Taylor (2007)]. 

Second, the ancestral selection graph from Neuhauser and Krone (1997) 
and Krone and Neuhauser (1997) gives a two-step procedure to derive the 
genealogy of a population sample. This construction can, for example, be 
used to see that any ancestor has a higher fitness than a randomly chosen 
individual [Fearnhead (2002)]. [Other results derived from the ancestral se- 
lection graph are, e.g., given in Fearnhead (2001), Slade (2000a, 2000b) and 
Etheridge and Griffiths (2009).] An important property of this second ap- 
proach is that the process generating the genealogy arises as a dual process 
of the measure-valued Fleming-Viot process [Mano (2009)]. A connection
between the first two approaches has recently been found in the case of 
strong balancing selection [Wakeley and Sargsyan (2009)]. 

Third, the lookdown construction of Donnelly and Kurtz (1996) and Don- 
nelly and Kurtz (1999) establishes a particle representation of the Fleming- 
Viot process with and without selection. Genealogies can as well be read 
off from the lookdown process. In the neutral case, the lookdown construc- 
tion has, for example, been used to study the evolution of the time to the 
most recent common ancestor of the population [Pfaffelhuber and Wakol- 
binger (2006), Delmas, Dhersin and Siri-Jegousse (2010)]. In the selective 
case, hardly any properties of the genealogies have been read off from the 
lookdown process. 

In the present paper, we extend the analysis of the neutral tree-valued 
Fleming-Viot process from Greven, Pfaffelhuber and Winter (2012) to include
mutation and selection. This leads to new tree-valued processes describing
the joint evolution of the allelic type frequencies and the underlying
genealogy. We encode random genealogies (trees) as random metric spaces;
see Evans (2000) for the first paper in this direction. In our construction, the
genealogies evolve forward in time, but contain historical information about
the population. Allelic types are encoded by marks attached to elements of
the metric space.

Fig. 1. Graphical construction of a tree-valued Moran model with two types with mutation
and selection. The fitter type is drawn by the black line and the weaker type by the gray line.
In the left part of the figure the gray arrows are used independently of the color of the involved
lines whereas the black arrows are only used if they start from a black line. Changes of
color along a single line are due to mutations. The right part shows how the percolation
structure on $S_N$ gives rise to a genealogical tree, that is, a (pseudo-)metric space on the set
of leaves. The leaves of the tree are marked by the types of the corresponding individuals.

The starting point of our investigation is the continuous-time Moran 
model with mutation and selection. This is a model of a population of 
finitely many (distinct) individuals evolving under resampling, mutation 
and selection and is best studied by its graphical representation. At any 
fixed time, this representation generates a genealogical tree marked with 
types; see also Figure 1. In a straightforward way, this allows us to introduce
dynamics of genealogies with marks (types) as a piecewise deterministic
Markov process with jumps. We show that the large population limit of
this collection of tree-valued Markov processes exists and is the unique so- 
lution of a martingale problem (Theorems 1 and 3). The resulting process is 
an enrichment of the measure-valued process and we call it the tree-valued 
Fleming-Viot process with mutation and selection (TFVMS). On the way,
we develop the stochastic analysis for tree-valued processes. In particular, 
we give a Girsanov-transform for our processes and show that genealogies 
with and without selection can be studied using a change of measure (The- 
orem 2). 
We continue by showing that the function-valued dual for the Fleming-Viot
process [see, e.g., Dawson (1993)] works in the tree-valued setting.
Using this duality and ideas from Dawson and Greven (2011, 2012a, 2012b), 
we obtain a stochastic representation for the expectation of functionals of 
sampled finite marked subtrees. As an application we establish the long-time 
behavior and the ergodicity of the TFVMS (Theorem 4), if the measure- 
valued Fleming-Viot process is ergodic. We use this equilibrium to study 
an important quantity in empirical population genetics in the case of two 
allelic types and additive selection: the genealogical distance of two randomly 
sampled individuals of the population. We compute the Laplace transform 
of the genealogical distance of two sampled individuals in the case where 
the selection coefficient is small (Theorem 5). This result suggests that tree- 
lengths are shorter under additive selection. This assertion is widely believed 
to be true among biologists, but has never been proved. 

Our construction gives a process on the space of marked trees, which we 
can treat as marked metric measure spaces. For convenience, we choose the 
space of types to be a compact metric space. For the construction, we re- 
quire knowledge of fundamental topological properties of the marked metric 
measure spaces. While the case without marks is treated in Greven, Pfaffel- 
huber and Winter (2009), topological properties for the case with marks are 
developed in Depperschmidt, Greven and Pfaffelhuber (2011). 

2. Moran models with mutation and selection. In this section, we first
describe a version of the Moran model with mutation and selection (Section 2.1),
its graphical construction (Section 2.2) and then extend the description
to the tree-valued case (Section 2.3). Finally, we discuss various
aspects of models including selection (Section 2.4).

2.1. The dynamics of the Moran model. Fix $N \in \mathbb{N}$, the population size
of the Moran model. Every individual carries an (allelic) type, an element of a
set $I$, and we assume that

(2.1) $I$ is a compact metric space

for convenience. The individuals of the population are denoted by $k, l, \ldots \in
\{1, \ldots, N\}$. The initial configuration is $(u_1(0), \ldots, u_N(0))$, where $u_k(0) \in I$
denotes the initial type of individual $k$. The population evolves as a pure
jump Markov process, and the dynamics are given through the following
mechanisms.

► Resampling (also known as pure genetic drift): every (unordered) pair
$k \neq l$ is replaced at the resampling rate

(2.2) $\gamma > 0$.

Upon such a resampling event, $l$ is replaced by an offspring of $k$, or $k$ is
replaced by an offspring of $l$, each with probability $\frac{1}{2}$. In other words, for
every ordered pair $k \neq l$, individual $l$ is replaced by an offspring of $k$ at
rate $\frac{\gamma}{2}$.

► Mutation: the type of every individual changes from $u$ to $v$ at rate

(2.3) $\vartheta \cdot \beta(u, dv)$,

where $\vartheta \geq 0$ (the mutation rate) and $\beta(\cdot, \cdot)$ is a stochastic kernel on $I$.

For selection, we have two different cases. (See also the discussion in Sec- 
tion 2.4 on other forms of selection.) Individuals are either haploid or diploid. 

► Haploid selection: every (ordered) pair $k \neq l$ is involved in a selection
event at rate

(2.4) $\frac{\alpha}{N} \cdot \chi(u_k)$

for $\alpha \geq 0$ (the selection coefficient) and a measurable fitness function $\chi : I \to
[0,1]$. Upon a selective event, individual $l$ is replaced by an offspring of
individual $k$.

► Diploid selection: every (ordered) triple of pairwise distinct $k, l, m$ is
involved in a selection event at rate

(2.5) $\frac{\alpha}{N^2} \cdot \chi'(u_k, u_m)$

for $\alpha \geq 0$ and a symmetric $[0,1]$-valued function $\chi'$ with $\chi'(u, v) = \chi'(v, u)$,
which denotes the fitness of the diploid $\{u, v\}$. Again, individual $l$ is replaced
by an offspring of individual $k$.
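
The mechanisms above can be simulated jump by jump. The following is a minimal sketch for the haploid case only, not a construction taken from the paper: the names `moran_step`, `run_until_fixation`, `chi` and `mutate` are illustrative assumptions, `gamma`, `theta` and `alpha` stand for the resampling rate, the mutation rate and the selection coefficient, and `mutate` is a hypothetical stand-in for drawing from the mutation kernel. Potential selective events are generated at rate alpha/N per ordered pair and realized with probability chi(u_k), which yields the rate in (2.4).

```python
import random

def moran_step(types, gamma, theta, alpha, chi, mutate, rng):
    """Apply one jump of the haploid Moran model in place; return the waiting time."""
    n = len(types)                    # assumes n >= 2
    pairs = n * (n - 1)               # ordered pairs (k, l) with k != l
    rate_mut = theta * n              # every individual mutates at rate theta
    rate_res = gamma / 2 * pairs      # every ordered pair resamples at rate gamma/2
    rate_sel = alpha / n * pairs      # potential selective events, rate alpha/N per ordered pair
    total = rate_mut + rate_res + rate_sel
    dt = rng.expovariate(total)       # exponential waiting time (requires total > 0)
    x = rng.random() * total          # choose the mechanism proportionally to its rate
    if x < rate_mut:
        k = rng.randrange(n)
        types[k] = mutate(types[k], rng)
    else:
        k, l = rng.sample(range(n), 2)        # uniformly chosen ordered pair, k != l
        is_resampling = x < rate_mut + rate_res
        # a potential selective event is realized with probability chi(u_k)
        if is_resampling or rng.random() < chi(types[k]):
            types[l] = types[k]               # l is replaced by an offspring of k
    return dt

def run_until_fixation(types, gamma, alpha, chi, rng, max_steps=100_000):
    """Run without mutation until one type is left (or max_steps is reached)."""
    types = list(types)               # work on a copy
    t = 0.0
    for _ in range(max_steps):
        if len(set(types)) == 1:
            break
        t += moran_step(types, gamma, 0.0, alpha, chi, lambda u, r: u, rng)
    return types, t
```

Generating potential events and thinning them keeps the total jump rate independent of the current type configuration, which is the same idea that the graphical construction of Section 2.2 relies on.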

Remark 2.1 (Diploid selection). While the mechanism for haploid se- 
lection is intuitively clear, the diploid case requires some explanation. Here, 
N is the number of haploid individuals, which are arranged in pairs to form 
diploids. Since the formation of diploids according to the type frequencies of 
the haploids acts on a fast timescale, we can assume that the population is 
in Hardy-Weinberg equilibrium at all times, meaning that the diploid indi- 
viduals are random pairs of haploids, and this formation is independent for 
all times. 

Actually, to model diploid selection, we would have to say that every
quadruple $k, l, m, n$ of pairwise distinct individuals is involved in a selective
event at rate $\alpha \cdot \chi'(u_k, u_m)/N^3$ in which the haploid $l$ from the diploid
individual $\{l, n\}$ is replaced by an offspring of haploid $k$ from the diploid
individual $\{k, m\}$. However, as the haploid individual $n$ is not affected by
such events, our definition above is appropriate.

Haploid and diploid selection lead to the same dynamics in special cases.
In the large population limit, we see that diploid selection reduces to the
haploid case for additive fitness, that is, if $\chi'$ is of the form $\chi'(u, v) = \chi(u) +
\chi(v)$ for some function $\chi$; see (3.20) and (3.23).
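
A heuristic computation indicates why additive fitness reduces to the haploid case. It is only a sketch under simplifying assumptions: the sum runs over all $m$, ignoring the restriction to pairwise distinct indices as in (2.13), and the precise statements are those in (3.20) and (3.23). Summing the diploid rate (2.5) over the partner $m$ gives the total rate at which $l$ is replaced by an offspring of $k$:

```latex
\sum_{m=1}^{N} \frac{\alpha}{N^2}\,\chi'(u_k,u_m)
  = \frac{\alpha}{N^2}\sum_{m=1}^{N}\bigl(\chi(u_k)+\chi(u_m)\bigr)
  = \frac{\alpha}{N}\,\chi(u_k)
    + \frac{\alpha}{N}\cdot\frac{1}{N}\sum_{m=1}^{N}\chi(u_m).
```

The first term is the haploid rate (2.4). The second term is the same for every $k$, so it acts like additional type-independent resampling at a rate of order $1/N$ per ordered pair, which is negligible against the resampling rate per ordered pair, which does not depend on $N$.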
2.2. The graphical construction. A useful construction of the Moran model
is by means of a random graph whose main benefit is to automatically
generate ancestral lines explicitly. For instance, we use these ancestral lines 
in order to bound the number of ancestors of the whole population (Propo- 
sition 6.9) and show tightness of a sequence of tree-valued Moran models 
(see the proof of Theorem 3). 

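Definition 2.2 (Graphical construction of the Moran model). For fixed
$N \in \mathbb{N}$, set

$U_N = \{1, \ldots, N\}$,

and consider the following families of independent Poisson point processes:

$\eta_{\mathrm{res}} := \{\eta_{\mathrm{res}}^{k,l} : k, l \in U_N\}$, each $\eta_{\mathrm{res}}^{k,l}$ with rate $\frac{\gamma}{2}$,

$\eta_{\mathrm{mut}} := \{\eta_{\mathrm{mut}}^{k} : k \in U_N\}$, each $\eta_{\mathrm{mut}}^{k}$ with rate $\vartheta$,

and

haploid selection: $\eta_{\mathrm{sel}} := \{\eta_{\mathrm{sel}}^{k,l} : k, l \in U_N\}$, each $\eta_{\mathrm{sel}}^{k,l}$ with rate $\frac{\alpha}{N}$,

diploid selection: $\eta_{\mathrm{sel}} := \{\eta_{\mathrm{sel}}^{k,l,m} : k, l, m \in U_N\}$, each $\eta_{\mathrm{sel}}^{k,l,m}$ with rate $\frac{\alpha}{N^2}$.

The graphical construction of the particle system defines a percolation structure
on the set $S_N := U_N \times [0, \infty)$. If $t \in \eta_{\mathrm{res}}^{k,l}$, we draw an arrow from $(k, t)$
to $(l, t)$. If $t \in \eta_{\mathrm{sel}}^{k,l}$ in the haploid case, or $t \in \eta_{\mathrm{sel}}^{k,l,m}$ in the diploid case, draw
a selective arrow from $(k, t)$ to $(l, t)$ in the haploid case and two different
selective arrows from $(k, t)$ to $(l, t)$ and from $(m, t)$ to $(l, t)$ in the diploid case.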

Finally, consider the type process $(u_k(t))_{k \in U_N, t \geq 0}$, starting in $u_1(0), \ldots,
u_N(0)$. Upon a resampling event $t \in \eta_{\mathrm{res}}^{k,l}$, set $u_l(t) = u_k(t-)$. In addition, we
say that $(k, t-)$ is the ancestor of $(l, t)$ at time $t-$. For $t \in \eta_{\mathrm{sel}}^{k,l}$, a selective
event takes place with probability $\chi(u_k(t-))$ in the haploid case. In this case
we set $u_l(t) = u_k(t-)$ and say that $(k, t-)$ is the ancestor of $(l, t)$ at time $t-$.
In the diploid case a selective event $t \in \eta_{\mathrm{sel}}^{k,l,m}$ takes place with probability
$\chi'(u_k(t-), u_m(t-))$, and we set $u_l(t) = u_k(t-)$. In this case $(k, t-)$ is the ancestor
of $(l, t)$ at time $t-$. Mutation events take place at times $t \in \eta_{\mathrm{mut}}^{k}$ where we
set $u_k(t) = v$ with probability $\beta(u_k(t-), dv)$.
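
The definition above can be sketched in code on a finite time horizon. This is an illustrative sketch for the haploid case under stated assumptions, not the paper's construction: the function names, the event encoding `(time, kind, k, l)` and the `mutate` callback (a hypothetical stand-in for drawing from the mutation kernel) are all made up for the example, and individuals are indexed from 0. Each Poisson process is sampled separately, and selective arrows are thinned with probability chi(u_k) when the types are run forward.

```python
import random

def poisson_times(rate, horizon, rng):
    """Points of a homogeneous Poisson process with the given rate on [0, horizon]."""
    times, t = [], 0.0
    if rate <= 0:
        return times
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return times
        times.append(t)

def graphical_construction(n, horizon, gamma, theta, alpha, rng):
    """All arrows and mutation marks on {0, ..., n-1} x [0, horizon], sorted by time."""
    events = []
    for k in range(n):
        events += [(t, "mut", k, None) for t in poisson_times(theta, horizon, rng)]
        for l in range(n):
            if k == l:
                continue
            # arrow from (k, t) to (l, t): resampling at rate gamma/2, selective at rate alpha/N
            events += [(t, "res", k, l) for t in poisson_times(gamma / 2, horizon, rng)]
            events += [(t, "sel", k, l) for t in poisson_times(alpha / n, horizon, rng)]
    events.sort(key=lambda e: e[0])
    return events

def run_types(events, initial_types, chi, mutate, rng):
    """Run the type process along the events; return final types and the used arrows."""
    types = list(initial_types)
    used = []                                   # (time, k, l): l replaced by offspring of k
    for t, kind, k, l in events:
        if kind == "mut":
            types[k] = mutate(types[k], rng)
        elif kind == "res" or rng.random() < chi(types[k]):
            types[l] = types[k]
            used.append((t, k, l))
    return types, used
```

The list of used arrows returned by `run_types` is exactly the information needed to trace ancestral lines backward in time.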

Example 2.3 (Example with haploid selection and two types). The left
part of Figure 1 illustrates the graphical construction of the Moran model
in the special case $N = 5$, haploid selection, $I = \{\bullet, \circ\}$ and $\chi = 1_{\{\bullet\}}$, that
is, $\bullet$ (black in Figure 1) is fit and $\circ$ (gray in Figure 1) is unfit. Mutation
from $\bullet$ to $\circ$ and vice versa occurs at two possibly different rates, denoted
$\vartheta_\bullet$ and $\vartheta_\circ$. Resampling arrows in $\eta_{\mathrm{res}}$
are drawn in gray, while selective arrows in $\eta_{\mathrm{sel}}$ are black. Thus, the gray
arrows are always used, whereas the black arrows are only used if they start
from black lines.

Remark 2.4 (Convergence to the Fleming-Viot process). Consider the
graphical construction of a Moran model of size $N$ with mutation and selection
from Definition 2.2. For any $t$, the types $u_1(t), \ldots, u_N(t) \in I$ of individuals
$1, \ldots, N$ at time $t$ can be read off. We define the $N$th empirical type
distribution process $\zeta^N = (\zeta_t^N)_{t \geq 0}$ by

$\zeta_t^N := \frac{1}{N} \sum_{k=1}^{N} \delta_{u_k(t)}$.

It is well known that $\zeta^N \Rightarrow \zeta$ as $N \to \infty$, where $\zeta = (\zeta_t)_{t \geq 0}$ is the measure-valued
Fleming-Viot process with mutation and selection; see, for example, Dawson
(1993), Ethier and Kurtz (1993), Etheridge (2001). In Example 3.9, we recall
its definition via a martingale problem.
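
For a finite type set, the empirical type distribution at a fixed time is simply a normalized count. The sketch below is illustrative only; the function names and the type labels are assumptions made for the example, and exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction

def empirical_type_distribution(types):
    """Map each type u to its frequency (1/N) * #{k : u_k = u}."""
    n = len(types)
    return {u: Fraction(c, n) for u, c in Counter(types).items()}

def integrate(zeta, phi):
    """Integral of a test function phi against the empirical distribution zeta."""
    return sum(w * phi(u) for u, w in zeta.items())
```

For instance, with five individuals of which three carry the type `"black"`, the frequency of `"black"` is 3/5, and integrating the indicator of `"black"` returns the same value.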

2.3. The tree-valued Moran model. We are now prepared to define the 
tree-valued stochastic process arising from the Moran model with mutation 
and selection, in terms of the graphical construction from Definition 2.2. For 
this purpose we will need the notion of ancestors. From Figure 1 it is clear 
that every $l \in U_N$ at time $t$ has an ancestor $A_s(l, t) \in U_N$ at time $s \leq t$.

Definition 2.5 (Tree-valued Moran model with mutation and selection).
We use the same notation as in Definition 2.2. For every $(l, t) \in S_N$, define
the $U_N$-valued, piecewise constant process $(A_s(l, t))_{0 \leq s \leq t}$ that jumps from $k$
at time $s$ to $j$ at time $s-$, if $(j, s-)$ is an ancestor of $(k, s)$ at time $s-$. We
then say that $A_s(l, t)$ is the ancestor of $(l, t)$ at time $s$.

The tree-valued Moran model of size $N$ with mutation and selection takes
values in triples $(U_N, r^N, \mu^N)$, where $r^N$ is a pseudo-metric on $U_N$ [i.e.,
$r^N(k, l) = 0$ is allowed for $k \neq l$] and $\mu^N$ is a probability measure on $U_N \times I$.

Starting in a pseudo-metric $r_0^N$ on $U_N$, we define for $k, l \in U_N$ and $t \geq 0$

(2.7) $r_t^N(k, l) := \begin{cases} 2(t - \sup\{s : A_s(k, t) = A_s(l, t)\}), & \text{if } A_0(k, t) = A_0(l, t), \\ 2t + r_0^N(A_0(k, t), A_0(l, t)), & \text{else}, \end{cases}$

a pseudo-metric on $U_N$, such that $r_t^N(k, l)$ is twice the time to the most
recent common ancestor of $k$ and $l$. Finally, we define the sampling measure
as

(2.8) $\mu_t^N := \frac{1}{N} \sum_{k=1}^{N} \delta_{(k, u_k(t))}$.

Then the tree-valued Moran model with mutation and selection is given by

(2.9) $((U_N, r_t^N, \mu_t^N))_{t \geq 0}$.
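
The ancestor process and the distance in (2.7) can be computed by tracing lines backward through the used arrows. The following is a minimal sketch under assumptions made for the example: `used` is a hypothetical list of triples `(time, k, l)` meaning that `l` was replaced by an offspring of `k` at that time, individuals are indexed from 0, and `r0` is the initial pseudo-metric given as a nested list.

```python
def ancestor(used, l, t, s):
    """Ancestor at time s of individual l living at time t (0 <= s <= t)."""
    cur = l
    for time, k, target in sorted(used, reverse=True):
        # an arrow at `time` into the current line moves the line to its source
        if s < time <= t and target == cur:
            cur = k
    return cur

def distance(used, r0, k, l, t):
    """Genealogical distance of k and l at time t, following (2.7)."""
    a, b = k, l
    if a == b:
        return 0.0
    for time, src, target in sorted(used, reverse=True):
        if time > t:
            continue
        if target == a:
            a = src
        elif target == b:
            b = src
        if a == b:
            # the two lines merge at `time`: twice the time back to the common ancestor
            return 2.0 * (t - time)
    # no common ancestor in [0, t]: add the initial distance of the time-0 ancestors
    return 2.0 * t + r0[a][b]
```

With `used = [(1.0, 0, 1)]` and `t = 3.0`, individuals 0 and 1 merge at time 1, so their distance is 4.0, while any pair without a common ancestor in `[0, t]` is at distance 6.0 plus the initial distance of its ancestors.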
Example 2.6 (Example with two types). Let us again consider Example 2.3
and Figure 1. For any time $t$, a genealogical tree can be read off
for the individuals $(1, t), \ldots, (5, t)$, giving rise to a (pseudo-)metric on $U_5$
based on genealogical distances. In addition, the types $u_1(t), \ldots, u_5(t)$ are
encoded in the graphical representation as well and give rise to the empirical
measure $\mu_t^5$.

Remark 2.7 (Trees as marked metric measure spaces, mark functions).
(1) Recall that an ultrametric space can be mapped isometrically in a unique
way onto the set of leaves of a rooted $\mathbb{R}$-tree, justifying the name tree-valued;
see also Remark 2.2 in Greven, Pfaffelhuber and Winter (2012).

(2) We call the states $(U_N, r^N, \mu^N)$ marked metric measure spaces (or
mmm-spaces); see also Definition 3.2. To define an appropriate notion of
convergence, we will have to pass from $(U_N, r_t^N, \mu_t^N)$ to equivalence classes
(also defined in detail in Definition 3.2). Roughly speaking, $(U_N, r^N, \mu^N)$
and $(U_N, \tilde{r}^N, \tilde{\mu}^N)$ are equivalent, if there is a bijection $\sigma$ on $U_N$ with
$r^N(\sigma(i), \sigma(j)) = \tilde{r}^N(i, j)$ and $\tilde{\mu}^N$ is the image of $\mu^N$ under the reordering
$\sigma$. We will write

(2.10) $\mathcal{U}_t^N := \overline{(U_N, r_t^N, \mu_t^N)}$

for the equivalence class of $(U_N, r_t^N, \mu_t^N)$, and call $\mathcal{U}^N = (\mathcal{U}_t^N)_{t \geq 0}$ the tree-valued
Moran model with mutation and selection (TMMMS).

(3) For the tree-valued Moran model, $((U_N, r_t^N, \mu_t^N))_{t \geq 0}$, we can define a
mark function, $\kappa_t(k) := u_k(t)$. Moreover, resampling/selection and mutation
occur at different time points, which implies that $\kappa_t$ is measurable with
respect to the Borel-$\sigma$-algebra of $(U_N, r_t^N)$ for all $t \geq 0$, almost surely. In
particular, $\mu_t^N$ has the special form

(2.11) $\mu_t^N(dk, du) = ((\pi_{U_N})_* \mu_t^N)(dk) \otimes \delta_{\kappa_t(k)}(du)$.

See Remark 3.11 for more on mark functions in the large population limit.

2.4. Background on selection. Since fitness is the fundamental concept 
in Darwin's Origin of Species, selection is the most important feature of 
population models in biology. A vast amount of literature is devoted to this 
topic. We briefly discuss aspects related to the tree-valued processes. 

Fertility, viability and state- dependent selection. In a selective event of 
the Moran model described in Section 2.1, an individual replaces a randomly 
drawn individual, independent of the fitness of the replaced individual. Thus, 
we take the special form of fertility selection here; that is, individuals might 
have a fitness bonus which determines their chances to produce a higher 
number of offspring. Sometimes, this is also called positive selection. 



In the case of viability or negative selection, individuals have a fitness
malus, which determines their chances to die and be replaced by the offspring
of a randomly drawn individual. In the case of viability selection acting on
haploids, we would have a fitness function $\chi : I \to [0,1]$, and every ordered
pair $k \neq l$ is involved in a selective event at rate $\alpha \cdot \chi(u_l)/N$. Upon such
an event, individual $l$ is replaced by an offspring of individual $k$. Our main
results, Theorems 1-5, carry over to the situation of viability selection.

Also the state-dependent selection can be incorporated in our model. For
this, recall the empirical type distribution $\zeta^N$ of the Moran model of size
$N$ from Remark 2.4. Consider the fitness function $\chi'' : I \times \mathcal{M}_1(I) \to [0,1]$,
that is, $\chi''(u, \zeta)$ is the fitness of type $u$ if the type distribution of the total
population is $\zeta$. An offspring of individual $k$ replaces the individual $l$ at rate
$\frac{\alpha}{N} \cdot \chi''(u_k, \zeta)$. However, if

(2.12) $\chi''(u, \zeta) = \int_I \chi'(u, v)\, \zeta(dv)$

for some $\chi' : I \times I \to [0, 1]$ we find that an offspring of individual $k$ replaces
individual $l$ at selective events occurring at rate

(2.13) $\frac{\alpha}{N} \cdot \chi''(u_k, \zeta) = \frac{\alpha}{N} \int_I \chi'(u_k, v)\, \zeta(dv) = \frac{\alpha}{N^2} \sum_{m=1}^{N} \chi'(u_k, u_m)$.

So, if (2.12) holds, (2.5) shows that state-dependent selection is the same as
diploid selection. Compare also Section 7.6 in Etheridge (2001).
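
The identity behind (2.13) can be checked numerically for a finite type set. The sketch below is illustrative: the function names, the two type labels and the fitness table are assumptions made for the example, and the type distribution is taken to be the empirical distribution of the listed types. Exact fractions make the comparison an equality rather than an approximation.

```python
from fractions import Fraction

def rate_via_type_distribution(alpha, chi2, u_k, types):
    """(alpha/N) times the integral of chi'(u_k, v) against the empirical distribution."""
    n = len(types)
    zeta = {}
    for u in types:                       # empirical type distribution of the population
        zeta[u] = zeta.get(u, Fraction(0)) + Fraction(1, n)
    integral = sum(w * chi2(u_k, v) for v, w in zeta.items())
    return alpha / n * integral

def rate_via_sum(alpha, chi2, u_k, types):
    """(alpha/N^2) times the sum over m of chi'(u_k, u_m)."""
    n = len(types)
    return alpha / n**2 * sum(chi2(u_k, u_m) for u_m in types)
```

Both functions return the same value for any population, since integrating against the empirical distribution is the same as averaging over the individuals.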

Kin selection. For measure-valued processes, selection is modeled by a
symmetric function $\chi' : I \times I \to \mathbb{R}$; see Definition 2.2. In the TMMMS we
encode both the type distribution and the genealogical tree in the process.
This allows us to treat diploid selection depending also on genealogical distance;
that is, we can deal with fitness functions of the form

(2.14) $\chi : I \times I \times \mathbb{R}_+ \to [0, 1]$.

Here, $\chi(u, v, r)$ is the fitness of a diploid individual with genotype $\{u, v\}$ if
the genealogical distance of the two haploids forming the diploid individual
is $r$. Equivalently, if $\mathfrak{u} = \overline{(U_N, r^N, \mu^N)}$ is the current state of the TMMMS,
then the offspring of the haploid individual $k \in U_N$ replaces individual $l \in U_N$
at a selective event taking place at rate

(2.15) $\frac{\alpha}{N^2} \sum_{m=1}^{N} \chi(u_k, u_m, r^N(k, m))$.

A special case of selection depending on genealogical distance is kin selection 
[e.g., Uyenoyama, Feldman and Mueller (1981)], leading to the concept of 
inclusive fitness [Hamilton (1964a, 1964b)]. The idea is that the fitness of
an individual is higher if close relatives are around who can help to raise 
offspring. Such an altruistic behavior can evolve since it might also be ben- 
eficial for the helpers, because offspring of close relatives is likely to carry 
similar genetic material. Such a scenario can be modeled using a fitness 
function of the form (2.15) that is decreasing in its third coordinate, that is, 
in the genealogical distance. 

The ancestral selection graph of Krone and Neuhauser. Genealogies un- 
der selection were studied in Neuhauser and Krone (1997) and Krone and 
Neuhauser (1997) by introducing the ancestral selection graph (ASG). The 
construction can easily be explained using Figure 1. Suppose that we are 
interested in the genealogy at time t. The ASG produces the genealogy in 
a three-step procedure from present to the past. Most importantly, when 
working backward in time, it is not known in advance if a selective arrow is 
used or not. 

(1) Going from the top downward through the graphical representation,
consider first the resampling and selective arrows. Two lines coalesce when
a resampling event occurs between them. If a line hits the tip of a selective
arrow, a branching event occurs. One line, the continuing line, is followed
in order to get information on the ancestral line if the selective arrow is not
used, and the other line, the incoming line, is followed if the selective arrow
is used. Wait until time 0 and stop the process.

(2) At time 0, mark all individuals according to the initial distribution,
and superimpose the mutation process along the graph, from time 0 to
time $t$.

(3) Go through all selective arrows between times 0 and $t$. Follow the
continuing line if the arrow does not go from a black line to a gray line,
because in this case, the selection event is not realized. In the other cases,
take the incoming branch.

As a result, one obtains genealogical distances of the time t population, 
together with their types. The main difference between the ASG and our 
construction is that the ASG gives the genealogy only at a single time, while 
we describe evolving genealogies. However, our dual process in Section 5 is 
reminiscent of the ASG. 

Outline: The paper is organized as follows. In Section 3, we state our main 
results on the TFVMS process. In Sections 4 and 5 we develop some tools 
which are not only needed in the proofs of the main results, but are also 
of interest in their own right. The techniques we use are a detailed analysis 
of the generator of TFVMS (Section 4) and duality of Markov processes 
(Section 5). In Section 6 we state and prove important facts concerning the 
Moran model. For instance, we give the generator characterization of the
finite population model (TMMMS) and discuss properties of numbers of 
ancestors and descendants. Finally, the proofs of our main results are given 
in Sections 7 and 8. 

We collect the most important notation needed in the paper in the Appendix. 

3. Results. In this section we formulate our main results in the set-up
of, and under the assumptions listed in, Sections 2.1 and 2.3. Our main point
is to establish that the weak limit of the process $((U_N, r_t^N, \mu_t^N))_{t \geq 0}$ from
Definition 2.5 as $N \to \infty$ exists, to characterize it intrinsically and to study its
properties. The result is the generalization of the convergence of the measure-valued
Moran models to the Fleming-Viot diffusion (see Remark 2.4) to the
level of marked genealogical trees.

Before we formulate the results, we have to specify the state space and 
give a summary of its properties in Section 3.1. Afterward, in Section 3.2, 
we give in Theorem 1 the construction of the TFVMS via a well-posed mar- 
tingale problem. Theorem 2 in Section 3.3 gives a Girsanov transformation 
between the neutral and the selective tree-valued processes, and Theorem 3
from Section 3.4 shows that the TFVMS arises as weak limit of TMMMS. 
The long-time behavior of TFVMS is studied in Theorem 4 of Section 3.5. 
Finally, an application to genealogical distances of sampled individuals in 
equilibrium is considered in Section 3.6, in Theorem 5. 

Remark 3.1 (Notation). For product spaces $X \times Y \times \cdots$, we denote
by $\pi_X, \pi_Y, \ldots$ the projection operators. For a Polish space $E$, the function
spaces $\mathcal{B}(E)$ and $\mathcal{C}(E)$ denote the bounded measurable and bounded continuous,
real-valued functions on $E$, respectively. We denote by $\mathcal{M}_1(E)$ the space
of probability measures on (the Borel sets of) $E$, equipped with the topology
of weak convergence, abbreviated by $\Rightarrow$. For $\mu \in \mathcal{M}_1(E)$ and $\phi \in \mathcal{B}(E)$, we
set $\langle \mu, \phi \rangle := \int \phi(x)\, \mu(dx)$. Moreover, for $\varphi : E \to E'$ (for some other Polish
space $E'$), the image measure of $\mu$ under $\varphi$ is denoted by $\varphi_* \mu$. For $A \subseteq \mathbb{R}$,
equipped with the Euclidean topology, we denote by $\mathcal{C}_E(A)$ ($\mathcal{D}_E(A)$) the
set of continuous (cadlag) functions $A \to E$, equipped with the topology of
uniform convergence on compact sets (the Skorohod topology).

3.1. State space. Here we introduce the set of isometry classes of marked 
ultrametric measure spaces (denoted by $\mathbb{U}^I$) that will be the state space of
both, the TMMMS and the TFVMS. The starting point of our definition 
are results from Greven, Pfaffelhuber and Winter (2009) that are extended 
in Depperschmidt, Greven and Pfaffelhuber (2011). While I is a compact 
metric space in all applications, the notions introduced in this subsection 
are valid for any Polish space /. 
Definition 3.2 (mmm-spaces). (1) An $I$-marked metric measure space,
$I$-mmm-space or mmm-space, for short, is a triple $(X, r, \mu)$ such that $(X, r)$
is a complete and separable metric space and $\mu \in \mathcal{M}_1(X \times I)$. Without loss
of generality we assume that $X \subseteq \mathbb{R}$.

(2) An mmm-space $(X, r, \mu)$ is called compact if $(\mathrm{supp}((\pi_X)_* \mu), r)$ is compact.
It is called ultrametric if $(\mathrm{supp}((\pi_X)_* \mu), r)$ is ultrametric.

(3) Two mmm-spaces $(X, r_X, \mu_X)$ and $(Y, r_Y, \mu_Y)$ are measure-preserving
isometric and $I$-preserving (or equivalent), if there exists a measurable map
$\varphi : X \to Y$ such that $r_X(x, x') = r_Y(\varphi(x), \varphi(x'))$ for all $x, x' \in \mathrm{supp}((\pi_X)_* \mu_X)$
and $\tilde{\varphi}_* \mu_X = \mu_Y$ for $\tilde{\varphi}(x, u) = (\varphi(x), u)$. The equivalence class of an
mmm-space $(X, r, \mu)$ is denoted by $\overline{(X, r, \mu)}$.

(4) We define

(3.1) $\mathbb{M}^I := \{\overline{(X, r, \mu)} : (X, r, \mu) \text{ mmm-space}\}$.

Moreover,

(3.2) $\mathbb{M}_c^I := \{\overline{(X, r, \mu)} : (X, r, \mu) \text{ compact mmm-space}\}$,
$\mathbb{U}^I := \{\overline{(X, r, \mu)} : (X, r, \mu) \text{ ultrametric mmm-space}\}$,
$\mathbb{U}_c^I := \mathbb{M}_c^I \cap \mathbb{U}^I$.

Generic elements of $\mathbb{M}^I$ ($\mathbb{U}^I$) are denoted by $\mathfrak{x}, \mathfrak{y}, \ldots$ ($\mathfrak{u}, \ldots$).

Remark 3.3 (Pseudo-metrics). Occasionally, we will encounter pseudo-metric
spaces $(X, r)$ [i.e., $r(x_1, x_2) = 0$ for $x_1 \neq x_2$ is possible]. The notion of
the equivalence class from Definition 3.2 carries over to marked pseudo-metric
measure spaces. Moreover, in the equivalence class $\overline{(X, r, \mu)}$ of a
marked pseudo-metric measure space $(X, r, \mu)$, we always find an mmm-space
$(X', r', \mu')$, such that the topology on $X$ generated by $r$ is in 1-1
correspondence to the topology on $X'$ generated by $r'$. That is, the open
subsets of $X$ can be mapped onto the open subsets of $X'$ and vice versa. In
particular, it is no restriction to use marked metric measure spaces instead
of marked pseudo-metric measure spaces.

In order to define an appropriate topology on M^, we introduce the notion 
of the marked distance matrix distribution. 

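Definition 3.4 (Marked distance matrix distribution). Let $(X,r,\mu)$ be
an mmm-space, $\mathfrak{x} := \overline{(X,r,\mu)} \in \mathbb{M}^I$ and

(3.3)  $R^{(X,r)} : (X \times I)^{\mathbb{N}} \to \mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}$,
       $((x_k,u_k))_{k \ge 1} \mapsto ((r(x_i,x_j))_{1 \le i<j}, (u_k)_{k \ge 1})$.

The marked distance matrix distribution of $\mathfrak{x} = \overline{(X,r,\mu)}$
is given by

(3.4)  $\nu^{\mathfrak{x}} := (R^{(X,r)})_*\mu^{\otimes\mathbb{N}} \in \mathcal{M}_1(\mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}})$.

Remark 3.5 (Distance matrix distribution is exchangeable). (1) Note
that $(R^{(X,r)})_*\mu^{\otimes\mathbb{N}}$ in the above definition does not
depend on the particular element $(X,r,\mu) \in \mathfrak{x} = \overline{(X,r,\mu)}$.
In particular, $\nu^{\mathfrak{x}}$ is well defined. Moreover, by Theorem 1 in
Depperschmidt, Greven and Pfaffelhuber (2011), we have
$\mathfrak{x} = \mathfrak{y}$ if and only if $\nu^{\mathfrak{x}} = \nu^{\mathfrak{y}}$.

(2) Let

(3.5)  $S := \{\sigma : \mathbb{N} \to \mathbb{N} \mid \sigma \text{ is injective}\}$

be the set of injective maps on $\mathbb{N}$. For $\sigma \in S$, set

(3.6)  $R_\sigma : \mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}} \to \mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}$,
       $((r_{ij})_{1 \le i<j}, (u_k)_{k \ge 1}) \mapsto ((r_{\sigma(i)\wedge\sigma(j),\,\sigma(i)\vee\sigma(j)})_{1 \le i<j}, (u_{\sigma(k)})_{k \ge 1})$.

Then, for $\mathfrak{x} \in \mathbb{M}^I$, the measure $\nu^{\mathfrak{x}}$ is
exchangeable in the sense that

(3.7)  $(R_\sigma)_*\nu^{\mathfrak{x}} = \nu^{\mathfrak{x}}$.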

Definition 3.6 (Marked Gromov-weak topology). Let
$\mathfrak{x}, \mathfrak{x}_1, \mathfrak{x}_2, \ldots \in \mathbb{M}^I$.
We say that $\mathfrak{x}_n \to \mathfrak{x}$ as $n \to \infty$ in the marked
Gromov-weak topology if

(3.8)  $\nu^{\mathfrak{x}_n} \Rightarrow \nu^{\mathfrak{x}}$ as $n \to \infty$

in the weak topology on $\mathcal{M}_1(\mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}})$,
where, as usual, $\mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}$ is
equipped with the product topology of $\mathbb{R}_+$ and $I$, respectively.

Several topological facts on the marked Gromov-weak topology were
established in Depperschmidt, Greven and Pfaffelhuber (2011). One of the
most important, showing that $\mathbb{M}^I$ is a space suitable for probability
theory, is that the space $\mathbb{M}^I$ is Polish [Theorem 2 in Depperschmidt,
Greven and Pfaffelhuber (2011)]. Before we state our results, we need to
introduce several function spaces on $\mathbb{M}^I$.

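Definition 3.7 (Polynomials). (1) We denote by

(3.9)  $\mathcal{B}_n := \mathcal{B}_n(\mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}})$,
       $\mathcal{C}_n := \mathcal{C}_n(\mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}})$,
       $\mathcal{C}_n^1 := \mathcal{C}_n^1(\mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}})$

the sets of bounded measurable (continuous, continuous and continuously
differentiable with respect to all variables in $\mathbb{R}_+^{\binom{\mathbb{N}}{2}}$)
functions $\phi$ on $\mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}$
such that $(\underline{\underline{r}},\underline{u}) \mapsto \phi(\underline{\underline{r}},\underline{u})$
depends on the first $\binom{n}{2}$ variables in $\underline{\underline{r}}$ and
the first $n$ in $\underline{u}$ only. (If $n = 0$, the spaces consist of
constant functions.)

(2) A function $\Phi : \mathbb{M}^I \to \mathbb{R}$ is a polynomial if, for some
$n \in \mathbb{N}$, there exists $\phi \in \mathcal{B}_n$, such that for all
$\mathfrak{x} \in \mathbb{M}^I$,

(3.10)  $\Phi(\mathfrak{x}) = \Phi^{n,\phi}(\mathfrak{x}) := \langle \nu^{\mathfrak{x}}, \phi \rangle = \int \phi(\underline{\underline{r}},\underline{u})\, \nu^{\mathfrak{x}}(d\underline{\underline{r}}, d\underline{u})$.

(3) The degree of a polynomial $\Phi$ is the smallest number $n$ for which there
exists $\phi \in \mathcal{B}_n$ such that (3.10) holds.

(4) Writing $\mathcal{C}_n^0 := \mathcal{C}_n$, we set

(3.11)  $\Pi := \bigcup_{n=0}^\infty \Pi_n$,   $\Pi_n := \{\Phi^{n,\phi} : \phi \in \mathcal{B}_n\}$,
        $\Pi^k := \bigcup_{n=0}^\infty \Pi_n^k$,   $\Pi_n^k := \{\Phi^{n,\phi} : \phi \in \mathcal{C}_n^k\}$,   $k = 0,1$.

We use the sets of polynomials as domains for the generator of the TFVMS
process. In this context, we require that $\Pi^1$ is an algebra that separates
points, a result proved in Proposition 4.1 in Depperschmidt, Greven and
Pfaffelhuber (2011).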
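
For a finite mmm-space, a polynomial as in (3.10) is a finite sum and can be evaluated exactly. The sketch below is an illustration under assumptions that are not taken from the paper: the measure is stored as a list of `(weight, point index, mark)` triples, and `phi` receives the pairwise distances as a dictionary together with the tuple of marks. It also illustrates why products of polynomials are again polynomials: the product of a degree-2 and a degree-1 polynomial is the degree-3 polynomial whose test function reads the second factor from the shifted sample.

```python
from itertools import product

def polynomial(points, r, n, phi):
    """Exact value of <nu, phi> for a finite mmm-space.

    points: list of (weight, x_index, mark), weights summing to 1
    r:      distance matrix on the point indices
    n:      number of sampled individuals phi depends on
    phi:    function of ({(i, j): r_ij for i < j}, tuple of marks), 0-based
    """
    total = 0.0
    for sample in product(points, repeat=n):
        weight = 1.0
        for w, _, _ in sample:
            weight *= w
        dist = {(i, j): r[sample[i][1]][sample[j][1]]
                for i in range(n) for j in range(i + 1, n)}
        marks = tuple(m for _, _, m in sample)
        total += weight * phi(dist, marks)
    return total

# Hypothetical two-point space: distance 2, marks "a" and "b".
points = [(0.25, 0, "a"), (0.75, 1, "b")]
r = [[0.0, 2.0], [2.0, 0.0]]
phi = lambda d, m: d[(0, 1)]                       # distance of two samples
psi = lambda d, m: 1.0 if m[0] == "a" else 0.0     # mark of one sample
# test function of the product: phi on samples 1,2 and psi on sample 3
phi_times_psi = lambda d, m: d[(0, 1)] * (1.0 if m[2] == "a" else 0.0)
```

Here `polynomial(points, r, 2, phi)` is the mean distance of two sampled individuals, and evaluating `phi_times_psi` with `n = 3` gives the product of the two lower-degree polynomials.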

3.2. Martingale problem. In this subsection, we define the TFVMS dynamics
by a well-posed martingale problem. First we recall the notion of martingale
problems that we use here; see Ethier and Kurtz (1986). Throughout
the following, $I$ is assumed to be a compact metric space (and hence Polish).

Definition 3.8 (Martingale problem). Let $E$ be a Polish space,
$P_0 \in \mathcal{M}_1(E)$, $\mathcal{F} \subseteq \mathcal{B}(E)$ and $\Omega$ a
linear operator on $\mathcal{B}(E)$ with domain $\mathcal{F}$. The law $P$ of an
$E$-valued stochastic process $X = (X_t)_{t \ge 0}$ is called a solution of
the $(P_0,\Omega,\mathcal{F})$-martingale problem if $X_0$ has distribution
$P_0$, $X$ has paths in the space $\mathcal{D}_E([0,\infty))$, almost surely,
and for all $F \in \mathcal{F}$,

(3.12)  $\Big( F(X_t) - \int_0^t \Omega F(X_s)\, ds \Big)_{t \ge 0}$

is a $P$-martingale with respect to the canonical filtration. Moreover, the
$(P_0,\Omega,\mathcal{F})$-martingale problem is said to be well-posed if there
is a unique solution $P$.

As an example we now give the martingale problem characterization of
the classical Fleming-Viot diffusion to prepare for the tree-valued process.

Example 3.9 (The measure-valued Fleming-Viot process). We recall
the classical Fleming-Viot measure-valued diffusion $\zeta = (\zeta_t)_{t \ge 0}$
with mutation and selection. It arises as the large population limit of the
process describing the evolution of type frequencies
$\zeta^N = (\zeta_t^N)_{t \ge 0}$ in the Moran models introduced in Section 2.
The state space is $\mathcal{M}_1(I)$, and $\zeta_t$ describes the distribution
of allelic types in the population at time $t$.

The process can be characterized in various ways by a martingale problem,
for example, by a second order differential operator on
$\mathcal{C}(\mathcal{M}_1(I))$ with domain $\mathcal{C}^2(\mathcal{M}_1(I))$,
with an appropriate definition of the derivative. However, our choice of an
operator on polynomials reveals best the connection to the tree-valued
process.

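Define the set of polynomials $\mathcal{F}$ on $\mathcal{M}_1(I)$ by letting
$\mathcal{F} = \bigcup_{n=0}^\infty \mathcal{F}_n$, where $\mathcal{F}_n$ is the
set of functions $\hat\Phi : \mathcal{M}_1(I) \to \mathbb{R}$ with
$\hat\Phi(\zeta) = \langle \zeta^{\otimes\mathbb{N}}, \hat\phi \rangle$ for some
$\hat\phi \in \mathcal{C}(I^{\mathbb{N}})$, depending only on the first $n$
variables. Define the linear operator $\hat\Omega$ on
$\mathcal{C}(\mathcal{M}_1(I))$ with domain $\mathcal{F}$

(3.13)  $\hat\Omega = \hat\Omega^{\mathrm{res}} + \hat\Omega^{\mathrm{mut}} + \hat\Omega^{\mathrm{sel}}$.

Here, for $\hat\Phi \in \mathcal{F}_n$ with
$\hat\Phi(\zeta) = \langle \zeta^{\otimes\mathbb{N}}, \hat\phi \rangle$, the
different terms are given as follows:

(1) For resampling rate $\gamma > 0$, the resampling operator is defined by

(3.14)  $\hat\Omega^{\mathrm{res}}\hat\Phi(\zeta) = \frac{\gamma}{2} \sum_{k,l=1}^n \langle \zeta^{\otimes\mathbb{N}}, \hat\phi \circ \hat\theta_{k,l} - \hat\phi \rangle$,

where the replacement operator $\hat\theta_{k,l}$ is the map which replaces the
$l$th component of an infinite sequence by the $k$th; that is, for
$\underline{u} = (u_1,u_2,\ldots)$,

(3.15)  $\hat\theta_{k,l}(\underline{u}) := (u_1,\ldots,u_{l-1},u_k,u_{l+1},\ldots)$.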

(2) For mutation rate $\vartheta \ge 0$, the mutation operator is defined by

(3.16)  $\hat\Omega^{\mathrm{mut}}\hat\Phi(\zeta) = \vartheta \sum_{k=1}^n \langle \zeta^{\otimes\mathbb{N}}, \hat B_k \hat\phi \rangle$,

where, for some stochastic kernel $\beta(\cdot,\cdot)$ on $I$,

(3.17)  $\hat B_k\hat\phi := \hat\beta_k\hat\phi - \hat\phi$,
        $(\hat\beta_k\hat\phi)(\underline{u}) := \int_I \hat\phi(\underline{u}_k^v)\, \beta(u_k,dv)$,
        $\underline{u}_k^v := (u_1,\ldots,u_{k-1},v,u_{k+1},\ldots)$.

That is, $\hat B_k$ is the bounded generator of a Markov jump process on $I$
(acting on the $k$th coordinate) with cadlag paths. It is always possible to
write

(3.18)  $\beta(u,dv) = z\bar\beta(dv) + (1-z)\tilde\beta(u,dv)$

for some $z \in [0,1]$, $\bar\beta \in \mathcal{M}_1(I)$ and a stochastic kernel
$\tilde\beta(\cdot,\cdot)$ on $I$. We refer to the case $z = 1$ as
parent-independent mutation or the house-of-cards model. The latter was
introduced in Kingman (1978) who argued that mutations might destroy the
fragile fitness advantage, which was built up during evolution, and lead to a
replacement with an independent type. In this case,

(3.19)  $\beta(u,dv) = \bar\beta(dv)$ does not depend on $u \in I$.

For $z \in (0,1]$, we say that mutation has a parent-independent component.
(3) For selection intensity $\alpha \ge 0$, the selection operator is given by

(3.20)  $\hat\Omega^{\mathrm{sel}}\hat\Phi(\zeta) = \alpha \sum_{k=1}^n \langle \zeta^{\otimes\mathbb{N}}, \hat\phi \cdot \chi'_{k,n+1} - \hat\phi \cdot \chi'_{n+1,n+2} \rangle$

with

(3.21)  $\chi'_{k,l}(\underline{u}) := \chi'(u_k,u_l)$

for a symmetric fitness function $\chi'$ on $I \times I$. If selection is
additive, that is,

(3.22)  $\chi'(u,v) = \chi(u) + \chi(v)$

for some function $\chi$ on $I$, then

(3.23)  $\hat\Omega^{\mathrm{sel}}\hat\Phi(\zeta) = \alpha \sum_{k=1}^n \langle \zeta^{\otimes\mathbb{N}}, \hat\phi \cdot \chi_k - \hat\phi \cdot \chi_{n+1} \rangle$,

where $\chi_k$ acts on the $k$th coordinate. Note that selective events lead to
replacements of individuals similar to resampling events [see also (6.12) and
(6.13) in the case of Moran models]. However, the replacement operator
$\hat\theta_{k,l}$ does not appear in (3.20) and (3.23). The reason (in the
haploid case) is that the chance that the $k$th individual reproduces through
a resampling event depends only on the fitness difference to a randomly chosen
individual from the population. See also (6.19), (6.20) and (6.21).

Given $P_0 \in \mathcal{M}_1(\mathcal{M}_1(I))$, it was shown in Ethier and
Kurtz (1993) [see also Dawson (1993)] that the
$(P_0,\hat\Omega,\mathcal{F})$-martingale problem is well-posed. We refer to the
solution as the (measure-valued) Fleming-Viot process with mutation and
selection, FVMS. This is a strong Markov process with continuous paths and
hence a diffusion.

More general generators were considered in Dawson and March (1995),
where state-dependent resampling and mutation rates were allowed. Selection
intensities depending on the state of the FVMS were considered in
Donnelly and Kurtz (1999) and unbounded selection operators are studied
in Ethier and Shiga (2000). In all these cases well-posedness of the
corresponding martingale problem was shown.

Definition 3.10 (Generator of TFVMS). We use the same notation as
in Example 3.9. The generator of TFVMS is the linear operator $\Omega$ on
$\Pi$ with domain $\Pi^1$, given by

(3.24)  $\Omega := \Omega^{\mathrm{grow}} + \Omega^{\mathrm{res}} + \Omega^{\mathrm{mut}} + \Omega^{\mathrm{sel}}$.

Here, for $\Phi = \Phi^{n,\phi} \in \Pi^1$, the different terms are given as
follows:

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(1) We define the growth operator by 
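(3.25)  $\Omega^{\mathrm{grow}}\Phi(\mathfrak{u}) := \langle \nu^{\mathfrak{u}}, \langle \nabla_{\underline{\underline{r}}}\phi, \underline{\underline{2}} \rangle \rangle$

with

(3.26)  $\langle \nabla_{\underline{\underline{r}}}\phi, \underline{\underline{2}} \rangle := 2 \sum_{1 \le i<j} \frac{\partial\phi}{\partial r_{ij}}(\underline{\underline{r}},\underline{u})$.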



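(2) We define the resampling operator by

(3.27)  $\Omega^{\mathrm{res}}\Phi(\mathfrak{u}) := \frac{\gamma}{2} \sum_{k,l=1}^n \langle \nu^{\mathfrak{u}}, \phi \circ \theta_{k,l} - \phi \rangle$

with $\theta_{k,l}(\underline{\underline{r}},\underline{u}) = (\tilde\theta_{k,l}(\underline{\underline{r}}), \hat\theta_{k,l}(\underline{u}))$
[recall $\hat\theta_{k,l}$ from (3.15)] and

(3.28)  $(\tilde\theta_{k,l}(\underline{\underline{r}}))_{i,j} := \begin{cases} r_{i,j}, & \text{if } i,j \neq l, \\ r_{i\wedge k,\, i\vee k}, & \text{if } j = l, \\ r_{j\wedge k,\, j\vee k}, & \text{if } i = l. \end{cases}$

As an example,

(3.29)  $\theta_{1,3}(\underline{\underline{r}},\underline{u}) = \left( \begin{pmatrix} 0 & r_{12} & 0 & r_{14} & r_{15} & \cdots \\ & 0 & r_{12} & r_{24} & r_{25} & \cdots \\ & & 0 & r_{14} & r_{15} & \cdots \\ & & & 0 & r_{45} & \cdots \\ & & & & 0 & \cdots \\ & & & & & \ddots \end{pmatrix}, (u_1,u_2,u_1,u_4,u_5,\ldots) \right)$.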
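
The action of the replacement operator in (3.28) can be reproduced on a finite truncation of the sample. The following is a minimal sketch, assuming the distances are stored as a full symmetric matrix with zero diagonal; the function name `theta` and the numerical labelling of the distances are illustrative only.

```python
def theta(k, l, r, u):
    """Apply theta_{k,l} to a truncated sample: individual l becomes a copy
    of individual k. k and l are 1-based as in the text; r is a symmetric
    m x m matrix with zero diagonal, u a list of m marks."""
    m = len(u)
    k0, l0 = k - 1, l - 1
    # position i reads its data from src[i]; only position l is redirected
    src = [k0 if i == l0 else i for i in range(m)]
    new_r = [[r[src[i]][src[j]] for j in range(m)] for i in range(m)]
    new_u = [u[src[i]] for i in range(m)]
    return new_r, new_u

# Label the distance r_ij by the number 10*i + j, so that r_12 -> 12 etc.
m = 5
r = [[0] * m for _ in range(m)]
for i in range(1, m + 1):
    for j in range(i + 1, m + 1):
        r[i - 1][j - 1] = r[j - 1][i - 1] = 10 * i + j
u = ["u1", "u2", "u3", "u4", "u5"]

new_r, new_u = theta(1, 3, r, u)
```

With this labelling, the upper triangle of `new_r` can be compared entry by entry with the case distinction in (3.28) for $k = 1$, $l = 3$: the third individual inherits all distances of the first, and their mutual distance becomes zero.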



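(3) For the mutation operator, let $\beta(\cdot,\cdot)$ be as in Example 3.9,
and set

(3.30)  $\Omega^{\mathrm{mut}}\Phi(\mathfrak{u}) := \vartheta \sum_{k=1}^n \langle \nu^{\mathfrak{u}}, B_k\phi \rangle$,

such that

(3.31)  $B_k\phi := \beta_k\phi - \phi$,
        $(\beta_k\phi)(\underline{\underline{r}},\underline{u}) := \int_I \phi(\underline{\underline{r}},\underline{u}_k^v)\, \beta(u_k,dv)$.

(4) For selection, consider

(3.32)  $\chi' : I \times I \times \mathbb{R}_+ \to [0,1]$

with $\chi'(u,v,r) = \chi'(v,u,r)$ for all $u,v \in I$, $r \in \mathbb{R}_+$;
recall (2.14). In our main results, we require that
$\chi' \in \mathcal{C}^{0,0,1}(I \times I \times \mathbb{R}_+)$; that is, $\chi'$
is continuous and continuously differentiable with respect to its third
coordinate. Then with

(3.33)  $\chi'_{k,l}(\underline{\underline{r}},\underline{u}) := \chi'(u_k,u_l,r_{k\wedge l,\, k\vee l})$

we set

(3.34)  $\Omega^{\mathrm{sel}}\Phi(\mathfrak{u}) := \alpha \sum_{k=1}^n \langle \nu^{\mathfrak{u}}, \phi \cdot \chi'_{k,n+1} - \phi \cdot \chi'_{n+1,n+2} \rangle$.

If $\chi'(u,v,r)$ does not depend on $r$, and if there is
$\chi : I \to [0,1]$ such that

(3.35)  $\chi'(u,v,r) = \chi(u) + \chi(v)$

[compare (3.22)], we say that selection is additive and conclude that with

(3.36)  $\chi_k(\underline{\underline{r}},\underline{u}) = \chi(u_k)$

we obtain

(3.37)  $\Omega^{\mathrm{sel}}\Phi(\mathfrak{u}) := \alpha \sum_{k=1}^n \langle \nu^{\mathfrak{u}}, \phi \cdot \chi_k - \phi \cdot \chi_{n+1} \rangle$.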
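
The passage from the general selection operator (3.34) to the additive form (3.37) is a short computation. The following worked step is a sketch of that reduction; it only uses (3.35), (3.36) and the exchangeability of $\nu^{\mathfrak{u}}$ from Remark 3.5.

```latex
% Insert the additive form chi'_{k,l} = chi_k + chi_l into one summand of (3.34):
\phi\cdot\chi'_{k,n+1} - \phi\cdot\chi'_{n+1,n+2}
  = \phi\cdot(\chi_k + \chi_{n+1}) - \phi\cdot(\chi_{n+1} + \chi_{n+2})
  = \phi\cdot\chi_k - \phi\cdot\chi_{n+2}.
% phi depends only on the first n sampled individuals, so by exchangeability
% the (n+2)nd individual may be relabelled as the (n+1)st:
\langle \nu^{\mathfrak{u}}, \phi\cdot\chi_{n+2}\rangle
  = \langle \nu^{\mathfrak{u}}, \phi\cdot\chi_{n+1}\rangle .
% Summing over k = 1,...,n and multiplying by alpha gives (3.37):
\alpha\sum_{k=1}^{n}\langle \nu^{\mathfrak{u}},
   \phi\cdot\chi'_{k,n+1} - \phi\cdot\chi'_{n+1,n+2}\rangle
  = \alpha\sum_{k=1}^{n}\langle \nu^{\mathfrak{u}},
   \phi\cdot\chi_k - \phi\cdot\chi_{n+1}\rangle .
```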

Now, we are ready to give our first main result. 

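Theorem 1 (Martingale problem is well posed). Let
$P_0 \in \mathcal{M}_1(\mathbb{U}^I)$, $\Pi^1$ be as in (3.11) and $\Omega$ as
in (3.24).

(1) The $(P_0,\Omega,\Pi^1)$-martingale problem is well posed. The unique
solution $\mathcal{U} := (\mathcal{U}_t)_{t \ge 0}$ is called the tree-valued
Fleming-Viot dynamics with mutation and selection (TFVMS).

(2) The process $\mathcal{U}$ has the following properties:

(a) $P(t \mapsto \mathcal{U}_t \text{ is continuous}) = 1$;

(b) $P(\mathcal{U}_t \in \mathbb{U}_c^I \text{ for all } t > 0) = 1$;

(c) $\mathfrak{u} \mapsto E[f(\mathcal{U}_t) \mid \mathcal{U}_0 = \mathfrak{u}]$
is continuous for all $f \in \mathcal{C}(\mathbb{U}^I)$, that is,
$\mathcal{U}$ has the Feller property;

(d) $\mathcal{U}$ is strong Markov;

(e) for $\Phi = \Phi^{n,\phi} \in \Pi^1$, the quadratic variation of the process
$\Phi(\mathcal{U}) = (\Phi(\mathcal{U}_t))_{t \ge 0}$ is given by

(3.38)  $[\Phi(\mathcal{U})]_t = \gamma \sum_{k,l=1}^n \int_0^t \langle \nu^{\mathcal{U}_s}, \phi \cdot ((\phi \circ \rho_1^n) \circ \theta_{k,n+l}) - \phi \cdot (\phi \circ \rho_1^n) \rangle\, ds$,

where

(3.39)  $\rho_1^n(\underline{\underline{r}},\underline{u}) = ((r_{i+n,j+n})_{1 \le i<j}, (u_{i+n})_{i \ge 1})$

denotes the $n$-shift of the sample sequence.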

Remark 3.11 (Mark function). We will show in forthcoming work that
states of the TFVMS only take special forms:

(1) Consider an mmm-space $\mathfrak{u} = \overline{(U,r,\mu)} \in \mathbb{U}^I$.
We say that $\mathfrak{u}$ has a mark function if there is a $U$-valued random
variable $X$ and $\kappa : U \to I$ [both measurable with respect to the Borel
$\sigma$-algebra of $(U,r)$] such that $(X,\kappa(X))$ has the distribution
$\mu$. In other words, $\mathfrak{u}$ has a mark function if there is a
measurable function $\kappa : U \to I$ with

(3.40)  $\mu(dx,du) = ((\pi_U)_*\mu)(dx) \cdot \delta_{\kappa(x)}(du)$.

As argued in Remark 2.7, the TMMMS always admits states which have a mark
function. It turns out that the same holds for the TFVMS as well.

(2) Another path property we will address are atoms of the measure
$(\pi_U)_*\mu$. Consider the TFVMS $\mathcal{U} = (\mathcal{U}_t)_{t \ge 0}$ with
$\mathcal{U}_t = \overline{(U,r,\mu)}$. Then, $(\pi_U)_*\mu$ has an atom if and
only if $((\pi_U)_*\mu)^{\otimes 2}\{(x,y) : r(x,y) = 0\} > 0$. We shall show
that $\mathcal{U}$ only takes values in the space of mmm-spaces
$\mathfrak{x} = \overline{(X,r,\mu)}$ with the property that $(\pi_X)_*\mu$ has
no atoms. Note that only the projection $(\pi_X)_*\mu$ can be free of atoms
since it is well known that $(\pi_I)_*\mu_t$ is atomic for all $t > 0$, almost
surely; see, for example, Theorem 10.4.5 in Ethier and Kurtz (1986).

3.3. Girsanov theorem for the TFVMS. One possibility to establish the 
existence and uniqueness of martingale problems and to analyze its prop- 
erties is to show that solutions of different martingale problems are abso- 
lutely continuous to each other for finite time horizons. Uniqueness as well 
as several other properties (e.g., path properties) then carry over from one 
martingale problem to the other. The densities of the solutions of the martin- 
gale problems are calculated by the Cameron-Martin-Girsanov theorem for 
real- valued semimartingales [see Theorem 16.19 in Kallenberg (2002)] and 
Dawson's Girsanov theorem for measure- valued processes [Dawson (1993), 
Section 7.2]. Here, we carry out the corresponding program for TFVMS by 
considering two martingale problems with different selection strength. 

Remark 3.12 (Notation). For $\alpha \in \mathbb{R}_+$, we write
$\Omega_\alpha$ and $\Omega_\alpha^{\mathrm{sel}}$ for the operators defined in
(3.24) and (3.34), respectively, when we want to stress the value of the
selection coefficient $\alpha$.

Theorem 2 (Girsanov transform for the TFVMS processes). Let
$\alpha, \alpha' \in \mathbb{R}_+$, $P_0 \in \mathcal{M}_1(\mathbb{U}^I)$, and
using $\chi'_{1,2}$ from (3.33) define $\Psi \in \Pi^1$ by

(3.41) ). 

7 

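Let $P \in \mathcal{M}_1(\mathcal{C}_{\mathbb{U}^I}(\mathbb{R}_+))$ be a solution
of the $(P_0,\Omega_\alpha,\Pi^1)$-martingale problem,
$\mathcal{U} = (\mathcal{U}_t)_{t \ge 0}$ the canonical process with respect to
$P$, $(\mathcal{F}_t)_{t \ge 0}$ its canonical filtration and

(3.42)  $\mathcal{M} = (\mathcal{M}_t)_{t \ge 0} = \Big( \Psi(\mathcal{U}_t) - \Psi(\mathcal{U}_0) - \int_0^t \Omega_\alpha\Psi(\mathcal{U}_s)\, ds \Big)_{t \ge 0}$.

Then, $\mathcal{M}$ is a $P$-martingale and the probability measure $Q$,
defined by

(3.43)  $\frac{dQ}{dP}\Big|_{\mathcal{F}_t} = e^{\mathcal{M}_t - (1/2)[\mathcal{M}]_t}$,

solves the $(P_0,\Omega_{\alpha'},\Pi^1)$-martingale problem.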



3.4. Convergence of Moran models. Our next task is to relate the Fleming- 
Viot process to the finite population models and their evolving genealogies 
on the level of trees, that is, mmm-spaces. 

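Definition 3.13 (TMMMS). Recall the process
$((U_N,r_t^N,\mu_t^N))_{t \ge 0}$ from Definition 2.5, started in a random
mmm-space $(U_N,r_0^N,\mu_0^N)$. The fitness function is either given as in
Definition 2.2 or by (2.14). The tree-valued Moran model with mutation and
selection (TMMMS) is given by

(3.44)  $\mathcal{U}^N = (\mathcal{U}_t^N)_{t \ge 0}$,   $\mathcal{U}_t^N = \overline{(U_N,r_t^N,\mu_t^N)}$.

Theorem 3 (Convergence to TFVMS). Let $\mathcal{U}^N$ be the TMMMS, started
in $\mathcal{U}_0^N$, and $\mathcal{U}$ be the TFVMS, started in
$\mathcal{U}_0$. If $\mathcal{U}_0^N \Rightarrow \mathcal{U}_0$ as
$N \to \infty$, weakly with respect to the Gromov-weak topology, then

(3.45)  $\mathcal{U}^N \Rightarrow \mathcal{U}$ as $N \to \infty$,

weakly with respect to the Skorohod topology on
$\mathcal{D}_{\mathbb{U}^I}([0,\infty))$.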

3.5. Long-time behavior. We now determine under which conditions the
TFVMS has a unique invariant measure and is ergodic. This is not always
the case, since already for the measure-valued process there are examples
where the process is nonergodic. (A trivial example is $\vartheta = 0$, but
cases when mutation has several invariant distributions are also possible.)

Recall the measure-valued Fleming-Viot process $\zeta = (\zeta_t)_{t \ge 0}$
from Example 3.9 and the projection $\pi_I$ on $I$ from Remark 3.1. Given
$\mathcal{U}_t = \overline{(U_t,r_t,\mu_t)}$, $t \ge 0$, define the process

(3.46)  $\hat\zeta := (\hat\zeta_t)_{t \ge 0} := ((\pi_I)_*\mu_t)_{t \ge 0}$,

and note that $(\hat\zeta_t)_{t \ge 0} = (\zeta_t)_{t \ge 0}$ if
$\chi'(u,v,r) = \chi'(u,v)$, that is, if the fitness is independent of the
genealogical distance. Hence, existence of a unique equilibrium for
$\hat\zeta$ is always implied by existence of a unique equilibrium for
$\mathcal{U}$. Theorem 4 shows that the opposite is also true. The proof of
Theorem 4 is based on duality, introduced in Section 5.

Theorem 4 (Long-time behavior). (a) Let
$\mathcal{U} = (\mathcal{U}_t)_{t \ge 0}$ be the TFVMS with
$\mathcal{U}_0 = \mathfrak{u}$ and $\hat\zeta$ be as above. Then there exists a
$\mathbb{U}^I$-valued random variable $\mathcal{U}_\infty$ with

(3.47)  $\mathcal{U}_t \Rightarrow \mathcal{U}_\infty$ as $t \to \infty$,

if and only if $\hat\zeta$ has a unique equilibrium distribution.

(b) The law of $\mathcal{U}_\infty$ is the unique invariant distribution of
$\mathcal{U}$. It depends on all the model parameters but is independent of the
initial state.

In particular, if mutation and selection are present, $\vartheta > 0$,
$\alpha > 0$ and mutation has a parent-independent component [i.e., (3.18)
holds for some $z \in (0,1]$], then (3.47) holds.

Remark 3.14 (Conditions for ergodicity of $\zeta$). Various results about
ergodicity of the measure-valued Fleming-Viot process have been obtained,
which carry over to the TFVMS by Theorem 4. For example, under neutral
evolution, $\alpha = 0$ (or $\chi' = 0$), ergodicity has been shown if the
Markov pure jump process on $I$ with generator (3.17) has a unique equilibrium
distribution [Dawson (1993)]. In the case $\alpha > 0$ and $\chi' \neq 0$,
ergodicity of $\zeta$ in the case of no parent-independent component in the
mutation operator [i.e., $z = 0$ in (3.18)] has been shown in Ethier and Kurtz
(1998) using coupling techniques. Using different techniques, Ethier and Kurtz
(1998) also prove an ergodic theorem for a version of the
infinitely-many-alleles model with symmetric overdominance. In Itatsu (2002)
a perturbative approach is used to prove ergodicity of measure-valued
Fleming-Viot processes with weak selection under an ergodicity assumption on
the mutation process. In Dawson and Greven (2012b) a set-valued dual [see also
Dawson and Greven (2011)] allows one to prove ergodic theorems, even if the
population is distributed on geographic sites, if mutation has a
parent-independent part.

3.6. Application: Distance between two individuals. It is widely believed
that genealogical distances under additive selection are smaller than under
neutrality. The heuristics are that beneficial alleles spread quicker through
the population than neutral ones by their fitness advantage. Hence, after the
allele has spread, randomly chosen individuals have a more recent last common
ancestor than under neutrality. In other words, genealogical distances
are shorter. However, shorter distances under selection are actually difficult
to ascertain, because there is no monotonicity of genealogical distances in
the selection coefficient $\alpha$ since the state of the process is due to an
intricate interaction between the mutation and the selection. (Note that, as
$\alpha \to \infty$, the genealogies look essentially neutral since fixation on
the fittest types takes place.) We cannot prove that genealogical distances
are shorter under additive selection yet, but we make a first step in that
direction.

Namely, we apply our machinery to the comparison of pairwise genealogical
distances in the selective and in the neutral case. We give a concrete
example of how genealogical distances change under selection in the case of
two alleles and if the selection coefficient is small.

In order to make the comparison of distances precise, we proceed as follows.
Let $\mathcal{U}_\infty^\alpha$ be the unique invariant $\mathbb{U}^I$-valued
random variable from Theorem 4 (if it exists). Let $R_{12}^\alpha$ denote the
distance of two randomly chosen points from $\mathcal{U}_\infty^\alpha$. Hence,

(3.48)  $R_{12}^\alpha$ has distribution $A \mapsto E[(r_{12})_*\nu^{\mathcal{U}_\infty^\alpha}(A)]$

for Borel sets $A \subseteq \mathbb{R}_+$, and $r_{12}$ denotes the function
$\underline{\underline{r}} \mapsto r_{12}$. In other words, the distribution of
$R_{12}^\alpha$ is the first moment measure of the random probability
distribution $(r_{12})_*\nu^{\mathcal{U}_\infty^\alpha}$. For $\alpha > 0$, the
issue is now to decide whether $R_{12}^\alpha \le R_{12}^0$ in stochastic order.

Remark 3.15 (Laplace-transform order and Landau symbol). (1) For
two random variables $X, Y$, we say that $X \le Y$ in the Laplace-transform
order if $E[e^{-\lambda X}] \ge E[e^{-\lambda Y}]$ for all $\lambda > 0$. Note
that this does not necessarily imply that $X \le Y$ stochastically.

(2) In the next theorem, we use the Landau symbol $O(\cdot)$. In particular,
for functions $g$ and $h$, both depending on $\alpha$, we write
$g(\alpha) = h(\alpha) + O(\alpha^3)$ as $\alpha \to 0$ if
$\limsup_{\alpha \to 0} |(g(\alpha) - h(\alpha))/\alpha^3| < \infty$.

The following theorem deals with the same case as Example 2.3.

Theorem 5 (Distance of two randomly sampled individuals) . Let I = 
{•, }, x(^) = !{«=•}• Assume that the mutation rate is 'i?/2 and for the 
mutation stochastic kernel (3{-,-), 

(3.49) ^ • /3(n, dv) = | + ^1{.=.} 

for some -i?,,"!? > with = -|- , that is, • mutates to at rate 
i9,/2 and from to • at rate "d /2. In addition, selection is additive, that 
is, (3.37) holds for some a > and := ^oo is as in Theorem 4- [Note 
that (3{u,dv) does not depend on u, and therefore (3.18) holds with z = 1.] 
Let Rf2 be as in (3.48). 
Then as $\alpha \to 0$, for $\lambda > 0$,

(3.50)  $E[e^{-\lambda R_{12}^\alpha}] = \frac{\gamma}{\gamma + 2\lambda} + f\alpha^2 + O(\alpha^3)$,

where / := /(7, -i? . ,A) is given by 

8-fi),'d {2-f + 2X + d)X 

^ ^(7 + 7?)(7 + 2A + ^) (67 + 2A + 7?)(7 + 2A)2 (67 -t- 4A 7?) ' 
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In particular, R^2 — ^12 '^^ Laplace-transform order for small a and 

Remark 3.16 [Distances under selection and connection to Krone and
Neuhauser (1997)]. (1) Under neutrality, $R_{12}^0$ is exponentially
distributed with rate $\gamma/2$, thus
$E[e^{-\lambda R_{12}^0}] = \frac{\gamma}{\gamma + 2\lambda}$. Note that for
small $\alpha$, the Laplace transform differs from the neutral case only in
second order in $\alpha$. The fact that the first order is the same as under
neutrality was already obtained by Krone and Neuhauser (1997) for a finite
Moran model. Our proof in Section 7.3 can be extended to obtain higher order
terms. However, it is an open problem to show $R_{12}^\alpha \le R_{12}^0$
stochastically for small $\alpha$ since the Laplace-transform order is weaker
than the stochastic order.

(2) The order $R_{12}^{\alpha_2} \le R_{12}^{\alpha_1}$ cannot be expected to
hold for all values $\alpha_1 < \alpha_2$. The reason is that for large values
of $\alpha$, most individuals in the population carry the fit type $\bullet$
and therefore, the genealogy is close to the Kingman coalescent with
pair-coalescence rate $\gamma$.
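
The neutral Laplace transform quoted in Remark 3.16(1) can be checked numerically. The sketch below is only an illustration: it integrates the density of an exponential distribution with rate $\gamma/2$ against $e^{-\lambda x}$ by the trapezoidal rule and compares the result with $\gamma/(\gamma + 2\lambda)$. The parameter values, the truncation point `upper` and the step count are arbitrary choices, not values from the paper.

```python
import math

def laplace_exponential(rate, lam, upper=200.0, steps=200_000):
    """Trapezoidal approximation of E[exp(-lam * R)] for R ~ Exp(rate),
    truncating the integral at `upper`."""
    h = upper / steps
    f = lambda x: rate * math.exp(-(rate + lam) * x)
    total = 0.5 * (f(0.0) + f(upper))
    for i in range(1, steps):
        total += f(i * h)
    return total * h

gamma, lam = 1.0, 0.7                      # illustrative parameter values
numeric = laplace_exponential(gamma / 2, lam)
closed_form = gamma / (gamma + 2 * lam)
```

The rate $\gamma/2$ reflects that the distance of two individuals is twice their coalescence time, and pairs coalesce at rate $\gamma$.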

Outline of the proof section: before we come to the proofs of
Theorems 1-5, we develop three main technical tools. These are an analysis
of the generator for the TFVMS (Section 4), duality (Section 5) and an
investigation of the tree-valued Moran model with mutation and selection
(Section 6). The proofs of Theorems 1-4 are given in Section 7, and the
application, Theorem 5, is proved in Section 8.

4. Infinitesimal characteristics. The TFVMS is a strong Markov process 
with continuous paths, and therefore may be called a tree-valued diffusion. 
Since generators of diffusions are typically second order differential opera- 
tors, it is natural to ask in which sense the same is true for the TFVMS 
with the generator from (3.24). Here it is useful to work with an abstract 
concept of order of linear operators. The distinction of first and second order 
terms is also the key to the proof of the Girsanov-type result, Theorem 2. 

4.1. First and second order operators. We recall some basic facts about
linear operators, which are related to differential operators. For their
connection to Markov processes see Fukushima and Stroock (1986) and
Section VIII.3 of Revuz and Yor (1999).

Definition 4.1 (First and second order operators). Let $\Omega$ be a linear
operator with domain $\mathcal{D}$ and $\Pi \subseteq \mathcal{D}$ an algebra.
We say that $\Omega$ is first order (with respect to $\Pi$) if for all
$\Phi \in \Pi$,

(4.1)  $\Omega\Phi^2 - 2\Phi \cdot \Omega\Phi = 0$.

We say that $\Omega$ is second order if it is not first order, and for all
$\Phi \in \Pi$

(4.2)  $\Omega\Phi^3 + 3\Phi^2 \cdot \Omega\Phi - 3\Phi \cdot \Omega\Phi^2 = 0$.

Remark 4.2 (Diffusions in $\mathbb{R}^d$ and higher order operators). (1) A
diffusion process on $\mathbb{R}^d$ has a generator

(4.3)  $\Omega = \Omega_1 + \Omega_2$,   $\Omega_1 := \sum_{i=1}^d \mu_i(x) \frac{\partial}{\partial x_i}$,   $\Omega_2 = \sum_{i,j=1}^d a_{ij}(x) \frac{\partial^2}{\partial x_i \partial x_j}$

with domain $\mathcal{D} = \mathcal{C}^2(\mathbb{R}^d)$, for a vector
$(\mu_i)_{i=1,\ldots,d}$ and a positive definite matrix
$(a_{ij})_{1 \le i,j \le d}$, which are continuous functions on $\mathbb{R}^d$.
It can be easily checked that $\Omega_1$ is a first order operator, and
$\Omega_2$ is a second order operator with respect to $\mathcal{D}$, according
to Definition 4.1. Hence, the above definitions of first and second order
operators extend the usual notions for differential operators.

(2) The operator defined through the left-hand side of (4.1) is connected
to the square field operator, also called operateur carre du champ, which is
given by

(4.4)  $\Gamma(\Phi,\Psi) := \Omega(\Phi\Psi) - \Phi\,\Omega\Psi - \Psi\,\Omega\Phi$.

In particular, a straightforward calculation (similar to the proof of
Lemma 4.4 below) shows that $\Omega$ is second order if and only if $\Gamma$ is
a derivation [in the sense of Bakry and Emery (1985), i.e.,
$\Gamma(\Phi\Psi,\Lambda) = \Phi\,\Gamma(\Psi,\Lambda) + \Psi\,\Gamma(\Phi,\Lambda)$
for all $\Phi,\Psi,\Lambda \in \Pi$].

(3) Typically, higher order operators do not arise if $\mathcal{D}$ is a subset
of continuous functions, and $\Omega$ is the generator of a Markov process
$(X_t)_{t \ge 0}$ with continuous paths. The reason is that
$(\Phi(X_t) - \int_0^t \Omega\Phi(X_s)\, ds)_{t \ge 0}$ is a continuous
martingale and therefore $(\Phi(X_t))_{t \ge 0}$ can only have quadratic
variation, which means that $\Omega$ is at most second order; see
Proposition 4.5 below.
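
The claim in Remark 4.2(1) can be verified mechanically for polynomial test functions in one variable. The sketch below is an illustration under simplifying assumptions that are not from the paper: $d = 1$, constant coefficients, and polynomials stored as coefficient lists. It evaluates the left-hand sides of (4.1) and (4.2) for a drift operator and a diffusion operator.

```python
def mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def add(*polys):
    """Sum of several polynomials of possibly different lengths."""
    n = max(len(p) for p in polys)
    return [sum(p[i] for p in polys if i < len(p)) for i in range(n)]

def scale(c, p):
    return [c * a for a in p]

def d(p):
    """Derivative of a polynomial; the derivative of a constant is [0]."""
    return [i * a for i, a in enumerate(p)][1:] or [0]

def first_order_defect(op, phi):
    """Left-hand side of (4.1): Omega(Phi^2) - 2 Phi Omega(Phi)."""
    return add(op(mul(phi, phi)), scale(-2, mul(phi, op(phi))))

def second_order_defect(op, phi):
    """Left-hand side of (4.2): Omega(Phi^3) + 3 Phi^2 Omega(Phi) - 3 Phi Omega(Phi^2)."""
    phi2 = mul(phi, phi)
    phi3 = mul(phi2, phi)
    return add(op(phi3), scale(3, mul(phi2, op(phi))), scale(-3, mul(phi, op(phi2))))

def is_zero(p):
    return all(a == 0 for a in p)

drift = lambda p: scale(2, d(p))           # Omega_1 = 2 d/dx (illustrative)
diffusion = lambda p: scale(3, d(d(p)))    # Omega_2 = 3 d^2/dx^2 (illustrative)
phi = [1, -2, 0, 5]                        # Phi(x) = 1 - 2x + 5x^3
```

The drift operator satisfies (4.1), while the diffusion operator violates (4.1) but satisfies (4.2); the defect of the diffusion operator in (4.1) is $6(\Phi')^2$, which is the square field term.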

First and second order operators satisfy some further relations when ap- 
plied to products or powers, which we derive next. 

Lemma 4.3 (First order operators). If a linear operator $\Omega$ is first
order with respect to the algebra $\Pi$, then for all $\Phi,\Psi \in \Pi$

(4.5)  $\Omega(\Phi\Psi) - \Phi \cdot \Omega\Psi - \Psi \cdot \Omega\Phi = 0$.

In particular, (4.2) holds.

Proof. Equation (4.5) follows immediately once we compute
$\Omega(\Phi + \Psi)^2$ and use linearity of $\Omega$. Furthermore, (4.2)
follows by using $\Psi = \Phi^2$ and (4.1) in (4.5). □




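Lemma 4.4 (Second order operators). If a linear operator $\Omega$ is first or
second order with respect to the algebra $\Pi$, then for all
$\Phi,\Psi \in \Pi$

(4.6)  $\Omega(\Psi\Phi^2) + 2\Psi\Phi \cdot \Omega\Phi + \Phi^2 \cdot \Omega\Psi - \Psi \cdot \Omega\Phi^2 - 2\Phi \cdot \Omega(\Psi\Phi) = 0$.

In particular, for any $\Phi \in \Pi$,

(4.7)  $\Omega\Phi^4 + 8\Phi^3 \cdot \Omega\Phi - 6\Phi^2 \cdot \Omega\Phi^2 = 0$.

Proof. Applying (4.2) to $(\Psi + \Phi)$ and $(\Psi - \Phi)$, and summing up,
gives

(4.8)  $0 = 2\Omega\Psi^3 + 6\Omega(\Psi\Phi^2) + 6\Psi^2 \cdot \Omega\Psi + 12\Psi\Phi \cdot \Omega\Phi + 6\Phi^2 \cdot \Omega\Psi$
       $\quad - 6\Psi \cdot \Omega\Psi^2 - 6\Psi \cdot \Omega\Phi^2 - 12\Phi \cdot \Omega(\Psi\Phi)$
       $= 6\Omega(\Psi\Phi^2) + 12\Psi\Phi \cdot \Omega\Phi + 6\Phi^2 \cdot \Omega\Psi - 6\Psi \cdot \Omega\Phi^2 - 12\Phi \cdot \Omega(\Psi\Phi)$,

which implies (4.6). To show (4.7), we use (4.6) with $\Psi = \Phi^2$ and obtain

(4.9)  $0 = \Omega\Phi^4 + 2\Phi^3 \cdot \Omega\Phi + \Phi^2 \cdot \Omega\Phi^2 - \Phi^2 \cdot \Omega\Phi^2 - 2\Phi \cdot \Omega\Phi^3$
       $= \Omega\Phi^4 + 8\Phi^3 \cdot \Omega\Phi - 6\Phi^2 \cdot \Omega\Phi^2$,

since $\Omega$ is at most second order. □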
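
As a quick sanity check of (4.7), one can take a concrete second order operator. The following worked example is an illustration only; the choice $\Omega = d^2/dx^2$ and $\Phi(x) = x^2$ is an assumption made for the example and does not appear in the paper.

```latex
% Omega = d^2/dx^2 on smooth functions of one variable, Phi(x) = x^2.
\Omega\Phi^4 = \frac{d^2}{dx^2}x^8 = 56x^6, \qquad
8\Phi^3\cdot\Omega\Phi = 8x^6\cdot 2 = 16x^6, \qquad
6\Phi^2\cdot\Omega\Phi^2 = 6x^4\cdot\frac{d^2}{dx^2}x^4 = 6x^4\cdot 12x^2 = 72x^6,
% so the left-hand side of (4.7) vanishes:
\Omega\Phi^4 + 8\Phi^3\cdot\Omega\Phi - 6\Phi^2\cdot\Omega\Phi^2
  = 56x^6 + 16x^6 - 72x^6 = 0 .
```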

4.2. Order of operators: Application to Markov processes. In this sub- 
section we use the concepts of the last subsection to compute processes of 
quadratic variation and covariation for functionals of a Markov process. 

Proposition 4.5 (Path continuity of second order martingale problems).
Let $E$ be a Polish space, $\Omega = \Omega^{(1)} + \Omega^{(2)}$ be a linear
operator on $\mathcal{B}(E)$ with domain $\mathcal{D} \subseteq \mathcal{C}(E)$,
where $\Omega^{(1)}$ is a first order operator, and $\Omega^{(2)}$ is a second
order operator. Assume that $\mathcal{D}$ contains a countable algebra $\Pi$
that separates points in $E$.

Assume that $X = (X_t)_{t \ge 0}$ is a solution of the
$(P_0,\Omega,\mathcal{D})$-martingale problem for $P_0 \in \mathcal{M}_1(E)$
(with paths in $\mathcal{D}_E([0,\infty))$). Then, $X$ has the following
path properties:

(1) $X$ has paths in $\mathcal{C}_E([0,\infty))$, almost surely;

(2) for $\Phi \in \Pi$, the process $\Phi(X) = (\Phi(X_t))_{t \ge 0}$ is a
continuous semimartingale with quadratic variation given by

(4.10)  $[\Phi(X)]_t = \int_0^t \big( \Omega^{(2)}\Phi^2(X_s) - 2\Phi(X_s) \cdot \Omega^{(2)}\Phi(X_s) \big)\, ds$.

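Corollary 4.6 (Covariation). Under the assumptions of Proposition 4.5,
let $\Phi,\Psi \in \Pi$. The covariation of the processes
$\Phi(X) = (\Phi(X_t))_{t \ge 0}$ and $\Psi(X) = (\Psi(X_t))_{t \ge 0}$ is
given by

$[\Phi(X),\Psi(X)]_t = \int_0^t \big( \Omega^{(2)}(\Phi\Psi)(X_s) - \Phi(X_s) \cdot \Omega^{(2)}\Psi(X_s) - \Psi(X_s) \cdot \Omega^{(2)}\Phi(X_s) \big)\, ds$.

Proof. This is a simple consequence of (4.10) and polarization. □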

Remark 4.7 [Connection to Bakry and Emery (1985)]. The path continuity
of functionals of $X$ was already studied by Bakry and Emery (1985)
using similar techniques. They show that $(\Phi(X_t))_{t \ge 0}$ is continuous
for all $\Phi \in \Pi$ if and only if the square field operator is a derivation
[or if and only if $\Omega$ is a second order operator; see Remark 4.2,
item (2)]. We extend their result, since Proposition 4.5 gives a sufficient
condition for path continuity of the process $X$ (rather than of functionals
of $X$). In order to show continuity of $X$, we must require that the domain
of $\Omega$ contains a countable algebra that separates points.

Remark 4.8 (Usual assumption on $\mathcal{D}$). Usually, in order to
guarantee that a solution of a martingale problem has paths in
$\mathcal{D}_E([0,\infty))$, one requires that $\mathcal{D}(\Omega)$ is
separating and contains a countable subset that separates points; see Ethier
and Kurtz (1986), Theorem 4.3.6.

Proof of Proposition 4.5. The proof consists of three steps. First,
we show that $\Phi(X)$ is continuous, almost surely, for all $\Phi \in \Pi$.
To have a self-contained proof, we give the full argument here. However, note
that continuity of $\Phi(X)$ follows from Proposition 2 in Bakry and Emery
(1985). Second, we establish that $t \mapsto X_t$ is almost surely continuous.
Third, we prove (4.10).

Step 1: $\Phi(X)$ has continuous paths: we use similar arguments as in the
proof of Theorem 1.1 and Corollary 1.2 in Fukushima and Stroock (1986) as
well as Kolmogorov's criterion [e.g., Proposition 3.10.3 in Ethier and Kurtz
(1986)]. Setting $\Phi_y(x) := \Phi(x) - \Phi(y)$ and using that $X$ solves the
martingale problem for $\Phi_y^2$, we see that

(4.11)  $E[(\Phi(X_t) - \Phi(X_s))^2] = E[\Phi_{X_s}^2(X_t)] = \int_s^t E[\Omega\Phi_{X_s}^2(X_r)]\, dr \le C(t-s)$

for some $C < \infty$ by the boundedness of $\Omega\Phi_y^2$. Moreover, by
Lemma 4.4, (4.7), using (4.11) and some $C' < \infty$,

(4.12)  $E[(\Phi(X_t) - \Phi(X_s))^4] = E[\Phi_{X_s}^4(X_t)]$
        $= \int_s^t E\big[\Phi_{X_s}^2(X_r)\big(6\,\Omega\Phi_{X_s}^2(X_r) - 8\,\Phi_{X_s}(X_r) \cdot \Omega\Phi_{X_s}(X_r)\big)\big]\, dr$
        $\le C' \int_s^t E[(\Phi(X_r) - \Phi(X_s))^2]\, dr \le C' \int_s^t (r-s)\, dr$
        $\le C'(t-s)^2$,

and continuity of $\Phi(X)$ follows.

Before we carry the continuity of $t \mapsto \Phi(X_t)$ for all
$\Phi \in \Pi$ over to continuity of $t \mapsto X_t$, we recall a basic
topological fact:

Remark 4.9. Let $\Pi \subseteq \mathcal{C}(E)$ separate points and
$x, x_1, x_2, \ldots \in K$, where $K \subseteq E$ is compact. Then,
$x_n \to x$ in $E$ as $n \to \infty$ if and only if $\Phi(x_n) \to \Phi(x)$ as
$n \to \infty$ for all $\Phi \in \Pi$.

The direction "$\Rightarrow$" is trivial, since all $\Phi$'s are continuous.
For "$\Leftarrow$," note that $\{x_1,x_2,\ldots\}$ is relatively compact by
assumption. Take any convergent subsequence $x_{n_k} \to y$ as
$k \to \infty$. Clearly, for all $\Phi \in \Pi$, we have
$\Phi(y) = \lim_{k \to \infty} \Phi(x_{n_k}) = \lim_{n \to \infty} \Phi(x_n) = \Phi(x)$
and hence, $x = y$ since $\Pi$ separates points.

Step 2: $X$ has continuous paths: next we show that $t \mapsto X_t$ is
continuous as a function on $[0,T] \cap \mathbb{Q}$ for all $T > 0$. Since $E$
is Polish, $P$ is regular and we can choose an increasing sequence of compact
subsets $K_1, K_2, \ldots \subseteq E$ with

(4.13)  $P(X_t \in K_n \text{ for all } 0 \le t \le T) > 1 - \frac{1}{n}$.

Then set

(4.14)  $\Omega_n := \{\omega : X_t(\omega) \in K_n \text{ for all } 0 \le t \le T\}$.

Moreover, take $\Omega'$ with $P(\Omega') = 1$ such that $\Phi(X)$ is
continuous on $\Omega'$ for all $\Phi \in \Pi$. Set
$\widetilde\Omega := \Omega' \cap \bigcup_{n=1}^\infty \Omega_n$, and note that
this set has probability 1. Let $\omega \in \Omega' \cap \Omega_n$ for some $n$
and $t \in \mathbb{Q} \cap [0,T]$. Then, for any
$t_1, t_2, \ldots \in \mathbb{Q} \cap [0,T]$ with $t_k \to t$ and
$X_{t_1}(\omega), X_{t_2}(\omega), \ldots \in K_n$, we have
$\Phi(X_{t_k}(\omega)) \to \Phi(X_t(\omega))$ as $k \to \infty$ for all
$\Phi \in \Pi$, and $X_{t_k}(\omega) \to X_t(\omega)$ follows as in
Remark 4.9. Consequently, $t \mapsto X_t(\omega)$ is continuous for all
$t \in \mathbb{Q} \cap [0,T]$ and hence is continuous for all $t \in [0,T]$,
because $X$ has sample paths in $\mathcal{D}_E([0,\infty))$ by assumption.
Since $T$ was arbitrary, continuity of sample paths $t \mapsto X_t$ follows.

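Step 3: proof of (4.10). Now, we show that the right-hand side of (4.10)
is the quadratic variation of $\Phi(X)$. First note that since
$\Omega^{(1)}$ is first order,

(4.15)  $\Omega\Phi^2 - 2\Phi \cdot \Omega\Phi = \Omega^{(2)}\Phi^2 - 2\Phi \cdot \Omega^{(2)}\Phi$.

We use martingales $(M_\Phi(t))_{t \ge 0}$ with

(4.16)  $M_\Phi(t) := \Phi(X_t) - \int_0^t \Omega\Phi(X_s)\, ds$.

Now we decompose the square of the martingale

(4.17)  $(M_\Phi(t))^2 = \Phi^2(X_t) - 2M_\Phi(t) \cdot \int_0^t \Omega\Phi(X_s)\, ds - \Big( \int_0^t \Omega\Phi(X_s)\, ds \Big)^2$.

Next using partial integration we have

(4.18)  $(M_\Phi(t))^2 = M_{\Phi^2}(t) + \int_0^t \Omega\Phi^2(X_s)\, ds - 2\int_0^t M_\Phi(s) \cdot \Omega\Phi(X_s)\, ds$
        $\quad - 2\int_0^t \Big( \int_0^s \Omega\Phi(X_r)\, dr \Big)\, dM_\Phi(s) - \Big( \int_0^t \Omega\Phi(X_s)\, ds \Big)^2$.

With (4.15) we get finally

(4.19)  $(M_\Phi(t))^2 = M_{\Phi^2}(t) - 2\int_0^t \Big( \int_0^s \Omega\Phi(X_r)\, dr \Big)\, dM_\Phi(s) + \int_0^t \big( \Omega^{(2)}\Phi^2(X_s) - 2\Phi(X_s) \cdot \Omega^{(2)}\Phi(X_s) \big)\, ds$.

Clearly, this is the decomposition of the submartingale $M_\Phi^2$ into its
martingale part and its increasing process, and (4.10) follows. □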

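4.3. Operators for the tree-valued FV process. We apply the concepts
of the last subsection to the different components of the generator for the
TFVMS process.

Proposition 4.10 (Order of generator terms of the TFVMS process).
(1) The operators $\Omega^{\mathrm{grow}}$, $\Omega^{\mathrm{sel}}$ and
$\Omega^{\mathrm{mut}}$ are first-order operators with respect to $\Pi^1$.

(2) The operator $\Omega^{\mathrm{res}}$ is a second-order operator with respect
to $\Pi^1$. Moreover, for $\Phi = \Phi^{n,\phi} \in \Pi^0$ and with $\rho_1^n$
from (3.39),

(4.20)  $\Omega^{\mathrm{res}}\Phi^2(\mathfrak{u}) - 2\Phi(\mathfrak{u}) \cdot \Omega^{\mathrm{res}}\Phi(\mathfrak{u}) = \gamma \sum_{k,l=1}^n \langle \nu^{\mathfrak{u}}, \phi \cdot ((\phi \circ \rho_1^n) \circ \theta_{k,n+l}) - \phi \cdot (\phi \circ \rho_1^n) \rangle$.

Proof. Let $\Phi = \Phi^{n,\phi} \in \Pi^1$. Then, using $\rho_1^n$ from (3.39),
we show that $\Omega^{\mathrm{grow}}$, $\Omega^{\mathrm{sel}}$ and
$\Omega^{\mathrm{mut}}$ are first-order operators by calculating

$\Omega^{\mathrm{grow}}\Phi^2(\mathfrak{u}) = \langle \nu^{\mathfrak{u}}, \langle \nabla_{\underline{\underline{r}}}(\phi \cdot (\phi \circ \rho_1^n)), \underline{\underline{2}} \rangle \rangle$
$= \langle \nu^{\mathfrak{u}}, \langle \nabla_{\underline{\underline{r}}}\phi, \underline{\underline{2}} \rangle \cdot (\phi \circ \rho_1^n) \rangle + \langle \nu^{\mathfrak{u}}, \phi \cdot \langle \nabla_{\underline{\underline{r}}}(\phi \circ \rho_1^n), \underline{\underline{2}} \rangle \rangle$
$= 2\Phi(\mathfrak{u}) \cdot \Omega^{\mathrm{grow}}\Phi(\mathfrak{u})$,

$\Omega^{\mathrm{sel}}\Phi^2(\mathfrak{u}) = \alpha \sum_{k=1}^{2n} \langle \nu^{\mathfrak{u}}, \phi \cdot (\phi \circ \rho_1^n) \cdot \chi'_{k,2n+1} - \phi \cdot (\phi \circ \rho_1^n) \cdot \chi'_{2n+1,2n+2} \rangle$
$= 2\alpha \sum_{k=1}^n \langle \nu^{\mathfrak{u}}, \phi \cdot \chi'_{k,n+1} \cdot (\phi \circ \rho_1^{n+1}) - \phi \cdot \chi'_{n+1,n+2} \cdot (\phi \circ \rho_1^{n+2}) \rangle$
$= 2\alpha \langle \nu^{\mathfrak{u}}, \phi \rangle \cdot \sum_{k=1}^n \langle \nu^{\mathfrak{u}}, \phi \cdot \chi'_{k,n+1} - \phi \cdot \chi'_{n+1,n+2} \rangle = 2\Phi(\mathfrak{u}) \cdot \Omega^{\mathrm{sel}}\Phi(\mathfrak{u})$,

$\Omega^{\mathrm{mut}}\Phi^2(\mathfrak{u}) = \vartheta \sum_{k=1}^{2n} \langle \nu^{\mathfrak{u}}, B_k(\phi \cdot (\phi \circ \rho_1^n)) \rangle = 2\vartheta \sum_{k=1}^n \langle \nu^{\mathfrak{u}}, (B_k\phi) \cdot (\phi \circ \rho_1^n) \rangle$
$= 2\langle \nu^{\mathfrak{u}}, \phi \rangle \cdot \vartheta \sum_{k=1}^n \langle \nu^{\mathfrak{u}}, B_k\phi \rangle = 2\Phi(\mathfrak{u}) \cdot \Omega^{\mathrm{mut}}\Phi(\mathfrak{u})$.

For $\Omega^{\mathrm{res}}$, Corollary 2.15 in Greven, Pfaffelhuber and Winter
(2012) shows (4.20). Informally, the second-order term, as given in (4.20),
arises by interactions between two samples, drawn independently from
$\mathfrak{u}$.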

In order to establish Q^'''^ as a second-order operator, observe that all 
interactions between three independently drawn samples are due to inter- 
actions between pairs of samples. A formal calculation showing that Q'^'^^ is 
second order is as follows: 

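$$\begin{aligned}
&-3\Phi(u)\,\Omega^{\mathrm{res}}\Phi^2(u) + 3\Phi^2(u)\,\Omega^{\mathrm{res}}\Phi(u) \\
&\quad= -3\Phi(u)\big(\Omega^{\mathrm{res}}\Phi^2(u) - 2\Phi(u)\,\Omega^{\mathrm{res}}\Phi(u)\big) - 3\Phi^2(u)\,\Omega^{\mathrm{res}}\Phi(u) \\
&\quad= -3\Phi(u)\,\gamma\sum_{k,l=1}^{n}\big\langle\nu^{u}, (\phi\cdot(\phi\circ\rho_1^n))\circ\theta_{k,n+l} - \phi\cdot(\phi\circ\rho_1^n)\big\rangle - 3\Phi^2(u)\,\Omega^{\mathrm{res}}\Phi(u),
\end{aligned}$$

where we used (4.20) in the last step. Furthermore,

$$\begin{aligned}
\Omega^{\mathrm{res}}\Phi^3(u) &= \frac{\gamma}{2}\sum_{k,l=1}^{3n}\big\langle\nu^{u}, (\phi\cdot(\phi\circ\rho_1^n)\cdot(\phi\circ\rho_1^{2n}))\circ\theta_{k,l} - \phi\cdot(\phi\circ\rho_1^n)\cdot(\phi\circ\rho_1^{2n})\big\rangle \\
&= \frac{3\gamma}{2}\sum_{k,l=1}^{n}\big\langle\nu^{u}, (\phi\circ\theta_{k,l})\cdot(\phi\circ\rho_1^n)\cdot(\phi\circ\rho_1^{2n}) - \phi\cdot(\phi\circ\rho_1^n)\cdot(\phi\circ\rho_1^{2n})\big\rangle \\
&\quad+ \frac{6\gamma}{2}\sum_{k,l=1}^{n}\big\langle\nu^{u}, ((\phi\cdot(\phi\circ\rho_1^n))\circ\theta_{k,n+l})\cdot(\phi\circ\rho_1^{2n}) - \phi\cdot(\phi\circ\rho_1^n)\cdot(\phi\circ\rho_1^{2n})\big\rangle \\
&= 3\Phi^2(u)\,\Omega^{\mathrm{res}}\Phi(u) + 3\Phi(u)\,\gamma\sum_{k,l=1}^{n}\big\langle\nu^{u}, (\phi\cdot(\phi\circ\rho_1^n))\circ\theta_{k,n+l} - \phi\cdot(\phi\circ\rho_1^n)\big\rangle.
\end{aligned}$$

Summing the last two displays, we see that $\Omega^{\mathrm{res}}$ is second order with respect
to $\Pi^1$, according to Definition 4.1. □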

5. Duality. One of the main tools in studying the long-time behavior of
a Markov process is to construct a dual process $\Xi$ and to study it in the limit
$t \to \infty$. In this section, we define a dual process of the TFVMS process,
which takes values in functions. Its state space is the following separable
metric space [recall (3.9)]:

(5.1) $\mathcal{T} := \bigcup_{n=0}^{\infty} C_n^1$

and the duality function $H(\cdot,\cdot)$ is

(5.2) $H(u,\xi) := \langle \nu^{u}, \xi\rangle.$

We next define the Markov process $\Xi$. The formal duality result is given in
Proposition 5.3.

Definition 5.1 (The function-valued dual process $\Xi$). The process $\Xi =
(\Xi_t)_{t\ge 0}$ is a piecewise deterministic jump process with state space $\mathcal{T}$. Recall
that the mutation transition kernel has the form (3.18) for some $z \in [0,1]$.
Here are the evolution rules:

(1) Between jumps the process evolves according to the semigroup

(5.3) $(S_t\xi)(\underline{\underline{r}},\underline{u}) = \xi(s_t\underline{\underline{r}},\underline{u})$

with

(5.4) $s_t((r_{ij})_{1\le i<j}) := (r_{ij} + 2t)_{1\le i<j}.$

(2) To describe the resampling transition, we define

(5.5) $\sigma_l((\underline{\underline{r}},\underline{u})) = \big((r_{i - 1_{\{i>l\}},\, j - 1_{\{j>l\}}})_{1\le i<j},\ (u_{i - 1_{\{i>l\}}})_{i\ge 1}\big).$

Then for $n \ge 1$, the process jumps from the state $\xi \in C_n^1$ to

(5.6) $\bar\theta_{k,l}\xi := \xi\circ\theta_{k,l}\circ\sigma_l$ at rate $\gamma/2$, $k,l = 1,\dots,n$, $k \ne l$,

(5.7) $\tilde\beta_k\xi$ at rate $\vartheta(1-z)$, $k = 1,\dots,n$,

(5.8) $\bar\beta_k\xi\circ\sigma_k$ at rate $\vartheta z$, $k = 1,\dots,n$,

with $\theta_{k,l}$ from before (3.28), and $\bar\beta_k$, $\tilde\beta_k$ as in (3.18). Since $\bar\beta_k\xi$ does not
depend on the $k$th variable, we note that $\langle\nu^{u}, \bar\beta_k\xi\circ\sigma_k\rangle = \langle\nu^{u}, \bar\beta_k\xi\rangle$, $u \in \mathbb{U}^I$;
see also (3.18), (3.31) and Remark 5.2 [item (3)].

(3) For haploid and diploid selection, (3.37) and (3.32), respectively, we
use an operation

(5.9) $\varsigma_k((\underline{\underline{r}},\underline{u})) = \big((r_{i + 1_{\{i\ge k\}},\, j + 1_{\{j\ge k\}}})_{1\le i<j},\ (u_{i + 1_{\{i\ge k\}}})_{i\ge 1}\big),$

which arises by deleting the $k$th column and line from $\underline{\underline{r}}$ and the $k$th entry
from $\underline{u}$. Then we introduce jumps from $\xi$ to (in the haploid and diploid case,
resp.)

(5.10) $\xi\cdot\chi_k + (\xi\circ\varsigma_k)\cdot(1-\chi_k)$ at rate $\alpha$, $k = 1,\dots,n$,

(5.11) $\xi\cdot\chi'_{k,n+2} + (\xi\circ\varsigma_k)\cdot(1-\chi'_{k,n+2})$ at rate $\alpha$, $k = 1,\dots,n$,

with $\chi_k$ as in (3.36), $\chi'_{k,n+2}$ as in (3.33). (These transitions are reminiscent of
the dual process $(\eta_t, \mathcal{G}_t^{++})_{t\ge 0}$ from Dawson and Greven (2011). In particular,
they differ from the construction given in Dawson and Greven (1999). See
Remark 5.2 [item (2)] for the advantage of our construction.)

(4) If $\xi \in C_0^1$ is constant, it stays in $C_0^1$ for all times.

Remark 5.2 (Behavior of $\Xi$ and underlying birth and death process).
(1) To better understand what is going on, look at the form of the function
after the transition. For example, for (5.6),

(5.12)
$$\begin{aligned}
(\bar\theta_{k,l}\xi)(\underline{\underline{r}},\underline{u}) &= \xi\big(\theta_{k,l}\big((r_{ij})_{i,j=1,2,\dots,l-1,l,l,l+1,\dots},\ (u_i)_{i=1,\dots,l-1,l,l,l+1,\dots}\big)\big) \\
&= \xi\big((r_{ij})_{i,j=1,2,\dots,l-1,k,l,l+1,\dots},\ (u_i)_{i=1,\dots,l-1,k,l,l+1,\dots}\big).
\end{aligned}$$

(2) In order to show that $\Xi$ is dual to the TFVMS (Proposition 5.3),
we could as well have used a transition from $\xi$ to $\xi\circ\theta_{k,l}$ instead of (5.6), to
$\xi\cdot\chi_k + \xi\cdot(1-\chi_{n+1})$ and to $\xi\cdot(\chi'_{k,n+1} + (1-\chi'_{n+1,n+2}))$ instead of (5.10) and
(5.11), respectively. However, the above formulation has two advantages:

► By (5.12), we see that $\bar\theta_{k,l}\xi \in C_{n-1}^1$ for $\xi \in C_n^1$.

► We can show that $\|\Xi_t\|_\infty$ is nonincreasing (see Proposition 5.4).

(3) For the process $\Xi$, consider the process $(N_t)_{t\ge 0}$, where $N_t = n$ if $\Xi_t \in
C_n^1$. In the case of selection acting on haploids, the process jumps from $n$ to

(5.13)
$n - 1$ at rate $\gamma\binom{n}{2} + \vartheta z\cdot n$,
$n + 1$ at rate $\alpha n$.

Note that the additional rate $\vartheta z\cdot n$ of decrease comes from the choice of
transitions $\xi \to \bar\beta_k\xi\circ\sigma_k$ instead of $\xi \to \bar\beta_k\xi$. The process $(N_t)_{t\ge 0}$ plays (for
$z = 0$) again an important role in Section 6.3 in estimating the numbers of
ancestors of the total population.
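The level process $(N_t)_{t\ge 0}$ with the rates in (5.13) can be explored numerically. The following is a minimal sketch using the Gillespie algorithm; the function name `simulate_dual_count` and its parameters are hypothetical and chosen only for this illustration (`gamma`, `theta`, `z`, `alpha` stand for the resampling rate, the mutation rate, the parent-independent weight and the selection coefficient).

```python
import random
from math import comb


def simulate_dual_count(n0, gamma, theta, z, alpha, t_max, rng):
    """Simulate the birth-death chain with rates as in (5.13) up to time t_max.

    Returns the list of (jump time, level) pairs, starting with (0.0, n0).
    """
    t, n = 0.0, n0
    path = [(t, n)]
    while True:
        down = gamma * comb(n, 2) + theta * z * n  # n -> n - 1
        up = alpha * n                             # n -> n + 1
        total = down + up
        if total == 0.0:
            break  # absorbing level: no further transitions possible
        t += rng.expovariate(total)  # exponential waiting time
        if t > t_max:
            break
        n += 1 if rng.random() < up / total else -1
        path.append((t, n))
    return path


if __name__ == "__main__":
    rng = random.Random(42)
    for t, n in simulate_dual_count(10, 1.0, 1.0, 1.0, 0.5, 5.0, rng)[:10]:
        print(f"t = {t:.4f}, N_t = {n}")
```

Because the downward rate grows quadratically in $n$ while the upward rate grows only linearly, simulated paths started at a large level drop quickly. With $\vartheta z > 0$ the level $0$ can be reached, whereas for $\vartheta z = 0$ and $\alpha > 0$ the chain stays at levels $\ge 1$.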

We can now state the duality relation between $\mathcal{U}$ and $\Xi$.

Proposition 5.3 (Duality relation). Let $\mathcal{U} = (\mathcal{U}_t)_{t\ge 0}$ be the tree-valued
Fleming-Viot process and $\Xi = (\Xi_t)_{t\ge 0}$ the function-valued process from
Definition 5.1.

(1) The set of functions $\{u \mapsto H(u,\xi) : \xi \in \mathcal{T}\}$ from (5.2) is separating
on $\mathcal{M}_1(\mathbb{U}^I)$.

(2) The processes $\mathcal{U}$, started in $\mathcal{U}_0 = u$, and $\Xi$, started in $\Xi_0 = \xi$, are dual
to each other, that is, for $H$ from (5.2) and $t \ge 0$,

(5.14) $\mathbf{E}_u[H(\mathcal{U}_t,\xi)] = \mathbf{E}_\xi[H(u,\Xi_t)].$

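Proof. For (1) we just note that $\{u \mapsto H(u,\xi) : \xi \in \mathcal{T}\} = \Pi^1$, which is
separating by Proposition 4.1 in Depperschmidt, Greven and Pfaffelhuber
(2011). For (2) we have to show that [Ethier and Kurtz (1986), Proposition
4.4.7]

(5.15) $(\Omega H(\cdot,\xi))(u) = (\Omega_{\mathrm{dual}} H(u,\cdot))(\xi), \quad u \in \mathbb{U}^I,\ \xi \in \mathcal{T},$

where $\Omega$ is the generator of $\mathcal{U}$, and $\Omega_{\mathrm{dual}}$ is the generator of the dual
process $\Xi$. We begin by calculating the left-hand side. For $\xi \in C_n^1$, in the case of
diploid selection (here the operators act on the first argument), we obtain

(5.16)
$$\begin{aligned}
\Omega^{\mathrm{res}}H(u,\xi) &= \frac{\gamma}{2}\sum_{k,l=1}^{n}\langle\nu^{u}, \xi\circ\theta_{k,l} - \xi\rangle = \frac{\gamma}{2}\sum_{\substack{k,l=1\\ k\ne l}}^{n}\langle\nu^{u}, \xi\circ\theta_{k,l}\circ\sigma_l - \xi\rangle, \\
\Omega^{\mathrm{mut}}H(u,\xi) &= \vartheta z\sum_{k=1}^{n}\langle\nu^{u}, \bar\beta_k\xi\circ\sigma_k - \xi\rangle + \vartheta(1-z)\sum_{k=1}^{n}\langle\nu^{u}, \tilde\beta_k\xi - \xi\rangle, \\
\Omega^{\mathrm{sel}}H(u,\xi) &= \alpha\sum_{k=1}^{n}\langle\nu^{u}, \xi\cdot\chi'_{k,n+1} - \xi\cdot\chi'_{n+1,n+2}\rangle \\
&= \alpha\sum_{k=1}^{n}\langle\nu^{u}, \xi\cdot\chi'_{k,n+2} - (\xi\circ\varsigma_k)\cdot\chi'_{k,n+2}\rangle
\end{aligned}$$

due to the exchangeability of $\nu^{u}$, where we have used that $\langle\nu^{u}, \bar\beta_k\xi\rangle =
\langle\nu^{u}, \bar\beta_k\xi\circ\sigma_k\rangle$ in $\Omega^{\mathrm{mut}}$. Summing both sides of all terms in the last display
exactly gives the left-hand side of (5.15). An analogous calculation shows
this in case of haploid selection.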

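Next we calculate the right-hand side of (5.15). The generator of the
Markov process $\Xi$ is easy to write down for functions of the form $\xi \mapsto \langle\nu,\xi\rangle$
with $\nu \in \mathcal{M}_1(\mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}})$. Let $\xi \in C_n^1$ for some $n = 0, 1, 2, \dots$.
First, consider the semigroup $(S_t)_{t\ge 0}$. Its generator is given by

(5.17) $\langle\nu,\xi\rangle \mapsto \langle\nu, \langle\nabla\xi, \underline{\underline{2}}\rangle\rangle.$

The other parts of the dynamics of $\Xi$ are pure jump. Hence, the generator
of $\Xi$ acts on the above functions in the following way:

$$\begin{aligned}
\Omega_{\mathrm{dual}}\langle\nu,\xi\rangle = {} & \langle\nu, \langle\nabla\xi, \underline{\underline{2}}\rangle\rangle + \frac{\gamma}{2}\sum_{\substack{k,l=1\\ k\ne l}}^{n}\big(\langle\nu, \bar\theta_{k,l}\xi\rangle - \langle\nu,\xi\rangle\big) \\
& + \vartheta z\sum_{k=1}^{n}\big(\langle\nu, \bar\beta_k\xi\circ\sigma_k\rangle - \langle\nu,\xi\rangle\big) + \vartheta(1-z)\sum_{k=1}^{n}\big(\langle\nu, \tilde\beta_k\xi\rangle - \langle\nu,\xi\rangle\big) \\
& + \alpha\sum_{k=1}^{n}\big(\langle\nu, \xi\cdot\chi'_{k,n+2} + (\xi\circ\varsigma_k)\cdot(1-\chi'_{k,n+2})\rangle - \langle\nu,\xi\rangle\big)
\end{aligned}$$

in the case of diploid selection. An analogous expression holds for haploid
selection. Combining the last display with (5.16) gives (5.15). □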

The following is fundamental in using the dual process for the analysis of 
the long-time behavior of U. 

Proposition 5.4 (Long-time behavior of $\Xi$). Let $\Xi = (\Xi_t)_{t\ge 0}$ be the
dual process from Definition 5.1. Then, the following assertions hold:

(1) $t \mapsto \|\Xi_t\|_\infty$ is a.s. nonincreasing;

(2) if $z \in (0,1]$, then $\Xi_t$ converges to a random variable $\Xi_\infty$ which is a.s.
bounded by $\|\Xi_0\|_\infty$;

(3) there is an a.s. finite time $T \ge 0$ such that $\Xi_T$ does not depend on $\underline{\underline{r}}$.

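Proof. (1) By a restart argument and right-continuity of $(\Xi_t)_{t\ge 0}$, it
suffices to show that $\|\Xi_t\|_\infty \le \|\Xi_0\|_\infty$ almost surely. For this, we consider
all transitions of the dual process. Between jumps it evolves according to
the semigroup $(S_t)_{t\ge 0}$ and, given $\Xi_0 = \xi \in C_n^1$,

(5.18) $\|S_t\xi\|_\infty = \sup_{(\underline{\underline{r}},\underline{u})} |\xi((r_{ij} + 2t)_{1\le i<j\le n}, \underline{u})| \le \|\xi\|_\infty.$

If $\Xi_{t-} = \xi$ and a jump occurs at time $t$, we have one of the following cases:

(5.19)
$$\begin{aligned}
\|\Xi_t\|_\infty &= \|\bar\theta_{k,l}\xi\|_\infty = \|\xi\circ\theta_{k,l}\circ\sigma_l\|_\infty \le \|\xi\|_\infty, \\
\|\Xi_t\|_\infty &= \sup_{(\underline{\underline{r}},\underline{u})}\Big|\int \xi(\underline{\underline{r}},\underline{u}_k^v)\,\beta(u_k,dv)\Big| \le \|\xi\|_\infty, \\
\|\xi\cdot\chi_k + (\xi\circ\varsigma_k)\cdot(1-\chi_k)\|_\infty &\le \|\xi\|_\infty\cdot\|\chi_k + (1-\chi_k)\|_\infty = \|\xi\|_\infty, \\
\|\xi\cdot\chi'_{k,n+2} + (\xi\circ\varsigma_k)\cdot(1-\chi'_{k,n+2})\|_\infty &\le \|\xi\|_\infty\cdot\|\chi'_{k,n+2} + (1-\chi'_{k,n+2})\|_\infty = \|\xi\|_\infty.
\end{aligned}$$

Hence, all transitions of $\Xi$ do not increase $\|\Xi_t\|_\infty$, and the result follows.

(2) Considering all possible transitions, it is clear that for $\xi \in C_n^1$ (see also
Remark 5.2),

(5.20)
$$\begin{aligned}
\xi\cdot\chi_k + (\xi\circ\varsigma_k)\cdot(1-\chi_k) &\in C_{n+1}^1, \\
\xi\cdot\chi'_{k,n+2} + (\xi\circ\varsigma_k)\cdot(1-\chi'_{k,n+2}) &\in C_{n+2}^1.
\end{aligned}$$

Moreover, in the case $z > 0$ and $\xi \in C_1^1$, we have $\bar\beta_1\xi\circ\sigma_1 \in C_0^1$. Recall from
Remark 5.2 [item (3)] that the process $(N_t)_{t\ge 0}$ with $N_t = n$ if $\Xi_t \in C_n^1$ decreases
at a quadratic rate and increases at a linear rate. In particular, there is an
almost surely finite stopping time $T$ with $\Xi_T \in C_0^1$; that is, $\Xi_T$ is constant
with $|\Xi_T| \le \|\Xi_0\|_\infty$; see (1).

(3) Note that any $\xi \in C_1^1$ does not depend on $\underline{\underline{r}}$. As in (2), $T = \inf\{t \ge
0 : \Xi_t \in C_1^1\}$ is almost surely finite, and we are done. □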

6. The tree-valued Moran model with mutation and selection. In this 
section, we study the tree-valued process introduced in Section 2.3. In Sec- 
tion 6.1, we give the generator of the TMMMS from Definition 2.5, show 
convergence to the generator of TFVMS in Section 6.2 and obtain some 
characteristics of the TMMMS in Section 6.3. 

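6.1. The martingale problem for the TMMMS. Recall the TMMMS $\mathcal{U}^N =
(\mathcal{U}_t^N)_{t\ge 0}$ with $\mathcal{U}_t^N := (U_N, r_t^N, \mu_t^N)$ from Definition 3.13. Its state space is

(6.1) $\mathbb{U}_N^I := \mathbb{M}_N^I \cap \mathbb{U}^I, \qquad \mathbb{M}_N^I := \{(X,r,\mu) \in \mathbb{M}^I : N\mu \in \mathcal{N}(X\times I)\},$

where $\mathcal{N}(X\times I)$ is the set of counting measures on $X\times I$. Note that $\mathbb{U}_N^I$ is
Polish as a closed subspace of the Polish space $\mathbb{U}^I$.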

In order to construct the TMMMS via its generator, we need to define 
its domain. The construction we use here is similar to the approach taken
in Sections 3.1 and 3.2, the main difference being that we have to sample
individuals from finite populations without replacement. Compare analogous 
concepts from Definition 3.4. 

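Definition 6.1 (Finite marked distance matrix distribution). Let $x =
(X,r,\mu) \in \mathbb{M}_N^I$.

(1) The sampling without replacement from $\mu$ uses the measure

(6.2)
$$\mu^{\otimes\downarrow N}(d\underline{x}, d\underline{u}) := \mu(dx_1,du_1) \otimes \frac{\mu - \frac{1}{N}\delta_{(x_1,u_1)}}{1-\frac{1}{N}}(dx_2,du_2) \otimes\cdots\otimes \frac{\mu - \frac{1}{N}\sum_{k=1}^{N-1}\delta_{(x_k,u_k)}}{1-\frac{N-1}{N}}(dx_N,du_N)$$

for $(\underline{x},\underline{u}) \in X^N \times I^N$.

(2) We define

(6.3)
$$R^{N,(X,r)} : (X\times I)^N \to \mathbb{R}_+^{\binom{N}{2}} \times I^N, \qquad ((x_i,u_i)_{1\le i\le N}) \mapsto \big((r(x_i,x_j))_{1\le i<j\le N},\ (u_k)_{1\le k\le N}\big),$$

and let $\nu^{N,x}$ denote the corresponding marked distance matrix distribution

(6.4) $\nu^{N,x} := (R^{N,(X,r)})_*\,\mu^{\otimes\downarrow N} \in \mathcal{M}_1(\mathbb{R}_+^{\binom{N}{2}} \times I^N).$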
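To make the sampling without replacement in (6.2) to (6.4) concrete, here is a minimal sketch for the special case in which $\mu$ puts mass $1/N$ on each of $N$ distinct individuals. Under that assumption a draw from $\mu^{\otimes\downarrow N}$ is simply a uniformly random ordering of the population. The function name and the toy metric are hypothetical and serve only this illustration; for convenience the full symmetric matrix is returned rather than only the entries with $i<j$.

```python
import random


def sample_marked_distances(points, marks, dist, rng):
    """Return one draw (r, u) of the marked distance matrix of a finite population.

    Assumes the sampling measure is uniform on the N distinct entries of
    `points`; all N individuals are drawn in random order without replacement.
    """
    order = list(range(len(points)))
    rng.shuffle(order)  # uniform random ordering = N draws without replacement
    r = [[dist(points[i], points[j]) for j in order] for i in order]
    u = [marks[i] for i in order]
    return r, u


if __name__ == "__main__":
    # Toy ultrametric on four individuals: distance 1 within a pair, 2 across pairs.
    pts = [0, 1, 2, 3]
    mks = ["a", "a", "b", "b"]
    d = lambda x, y: 0.0 if x == y else (1.0 if x // 2 == y // 2 else 2.0)
    r, u = sample_marked_distances(pts, mks, d, random.Random(0))
    print(u)
    for row in r:
        print(row)
```

Since no individual is drawn twice, all off-diagonal distances are strictly positive. This is the visible difference from sampling with replacement, where the same individual may appear twice and produce a zero distance.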

Remark 6.2 (Marked distance distribution is well defined on $\mathbb{U}_N^I$). (1)
As in Remark 3.5, for $x = (X,r,\mu) \in \mathbb{M}_N^I$, the marked distance matrix
distribution $\nu^{N,x}$ does not depend on the representative $(X,r,\mu)$ and hence is
well defined.

(2) Let $x = (X,r,\mu) \in \mathbb{M}^I \setminus \mathbb{M}_N^I$. Then, $\mu^{\otimes\downarrow N}$ can still be defined as in
(6.2), but is a signed measure. The same holds for $\nu^{N,x}$.

Now we can define the domain and range of the generator of the TMMMS. 

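Definition 6.3 (Polynomials on $\mathbb{U}_N^I$). A function $\Phi : \mathbb{U}_N^I \to \mathbb{R}$ is a
polynomial if there exists $\phi \in \mathcal{B}(\mathbb{R}_+^{\binom{N}{2}} \times I^N)$ such that

(6.5) $\Phi(u) = \langle\nu^{N,u}, \phi\rangle = \int \phi(\underline{\underline{r}},\underline{u})\,\nu^{N,u}(d\underline{\underline{r}}, d\underline{u}).$

In this case, we set $\Phi^{N,\phi} := \Phi$. As the space of all polynomials of this form is
not an algebra, we define

(6.6) $\Pi_N :=$ algebra generated by $\{\Phi^{N,\phi} : \phi \in \mathcal{B}(\mathbb{R}_+^{\binom{N}{2}} \times I^N)\},$

(6.7) $\Pi_N^1 :=$ algebra generated by $\{\Phi^{N,\phi} : \phi \in C_b^1(\mathbb{R}_+^{\binom{N}{2}} \times I^N)\},$

where differentiability in $C_b^1(\mathbb{R}_+^{\binom{N}{2}} \times I^N)$ is only required for the coordinates
in $\mathbb{R}_+^{\binom{N}{2}}$.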

For the definition of the generator of the TMMMS recall the notation 
introduced in Definition 3.10 and (2.14). 

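Definition 6.4 (Generator of the TMMMS). The generator of the TMMMS
with population size $N$ is the linear operator $\Omega^N$ on $\Pi_N$ with domain $\Pi_N^1$
given by

(6.8) $\Omega^N := \Omega^{\mathrm{grow},N} + \Omega^{\mathrm{res},N} + \Omega^{\mathrm{mut},N} + \Omega^{\mathrm{sel},N}.$

The growth and resampling operators are given by

(6.9) $\Omega^{\mathrm{grow},N}\Phi^{N,\phi}(u) := \langle\nu^{N,u}, \langle\nabla\phi, \underline{\underline{2}}\rangle\rangle,$

(6.10) $\Omega^{\mathrm{res},N}\Phi^{N,\phi}(u) := \frac{\gamma}{2}\sum_{k,l=1}^{N}\big(\langle\nu^{N,u}, \phi\circ\theta_{k,l}\rangle - \langle\nu^{N,u}, \phi\rangle\big).$

The mutation operator is given by

(6.11) $\Omega^{\mathrm{mut},N}\Phi^{N,\phi}(u) := \vartheta\sum_{k=1}^{N}\langle\nu^{N,u}, B_k\phi\rangle.$

The selection operators in the cases of haploid and diploid selection are given
by

(6.12) $\Omega^{\mathrm{sel},N}\Phi^{N,\phi}(u) := \frac{\alpha}{N}\sum_{k,l=1}^{N}\langle\nu^{N,u}, \chi_k(\phi\circ\theta_{k,l} - \phi)\rangle$

and

(6.13) $\Omega^{\mathrm{sel},N}\Phi^{N,\phi}(u) := \frac{\alpha}{N^2}\sum_{k,l,m=1}^{N}\langle\nu^{N,u}, \chi'_{k,m}(\phi\circ\theta_{k,l} - \phi)\rangle,$

respectively.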



Remark 6.5 (Interpretation of generator terms). Clearly, the generator
terms $\Omega^{\mathrm{grow},N}$ and $\Omega^{\mathrm{res},N}$ describe tree growth and resampling; see also
Section 5.1 of Greven, Pfaffelhuber and Winter (2012) for the case without
marks. The terms $\Omega^{\mathrm{res},N}$ and $\Omega^{\mathrm{mut},N}$ describe resampling and mutation arising
from the Poisson processes $\eta^{\mathrm{res}}$ and $\eta^{\mathrm{mut}}$ from Definition 2.2, respectively.
For selection, recall $\eta^{\mathrm{sel}}$ from that definition. In the case of haploid selection,
$l$ is replaced by an offspring of $k$ at rate $\alpha\chi(u_k)/N$, for $k,l = 1,\dots,N$, which
easily translates into (6.12). The case of diploid selection is similar.

Proposition 6.6 (Well-posedness of TMMMS martingale problem). Let
$N \in \mathbb{N}$, $\mathbf{P}_0 \in \mathcal{M}_1(\mathbb{U}_N^I)$, $\Pi_N^1$ as in (6.7) and $\Omega^N$ as in (6.8). Then, the
$(\mathbf{P}_0, \Omega^N, \Pi_N^1)$-martingale problem has exactly one solution, the tree-valued
Moran model with mutation and selection.

Proof. Existence is straightforward from the graphical construction
(see Definition 2.2 and Remark 6.5). In particular, the TMMMS solves
the $(\mathbf{P}_0, \Omega^N, \Pi_N^1)$-martingale problem. To get well-posedness, note that
the $(\mathbf{P}_0, \Omega^{\mathrm{grow},N}, \Pi_N^1)$-martingale problem is well posed. Furthermore $B :=
\Omega^{\mathrm{res},N} + \Omega^{\mathrm{mut},N} + \Omega^{\mathrm{sel},N}$ is a bounded jump operator (since the population is
finite). Hence, uniqueness follows from Theorem 4.10.3 in Ethier and Kurtz
(1986). □



6.2. Convergence of generators. Here, we prove that the sequence of
generators $\Omega^N$ of the TMMMS defined in (6.8) converges (uniformly) to the
generator $\Omega$ for the TFVMS from (3.24).

Proposition 6.7 (Generator convergence). For any $\Phi \in \Pi^1$ there is a
sequence $(\Phi_N)_{N\in\mathbb{N}}$ such that $\Phi_N \in \Pi_N^1$ for all $N$ and

(6.14) $\lim_{N\to\infty}\sup_{u\in\mathbb{U}_N^I} |\Phi_N(u) - \Phi(u)| = 0,$

(6.15) $\lim_{N\to\infty}\sup_{u\in\mathbb{U}_N^I} |\Omega^N\Phi_N(u) - \Omega\Phi(u)| = 0.$

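Proof. Let $\Phi \in \Pi^1$. Then, by definition of $\Pi^1$, $\Phi = \Phi^{n,\phi}$ for some $n \in \mathbb{N}$
and $\phi \in C_n^1$. We define $\tilde\nu^{N,u} := (\iota_N)_*\nu^{N,u}$ for

(6.16)
$$\iota_N : \mathbb{R}_+^{\binom{N}{2}} \times I^N \to \mathbb{R}_+^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}, \qquad \big((r_{i,j})_{1\le i<j\le N},\ (u_i)_{1\le i\le N}\big) \mapsto \big((r_{i\sim N,\, j\sim N})_{1\le i<j},\ (u_{\ell\sim N})_{1\le\ell}\big),$$

where $i\sim N := 1 + ((i-1) \bmod N)$. We define $\Phi_N \in \Pi_N^1$ by setting

(6.17) $\Phi_N(u) = \langle\nu^{N,u}, \phi\circ\iota_N\rangle = \langle\tilde\nu^{N,u}, \phi\rangle.$

Then there is a constant $C = C(n,\phi) > 0$, such that for all $N \ge n$,

(6.18) $\sup_{u\in\mathbb{U}_N^I} |\Phi_N(u) - \Phi(u)| = \sup_{u\in\mathbb{U}_N^I} |\langle\tilde\nu^{N,u} - \nu^{u}, \phi\rangle| \le \frac{C}{N}.$

To show (6.15) for $\Phi \in \Pi^1$ in the case $\alpha = 0$, note that $\Omega\Phi(u) = \langle\nu^{u}, \psi\rangle$
and $\Omega^N\Phi_N(u) = \langle\tilde\nu^{N,u}, \psi\rangle$ for some $\psi \in C_n^0$. Thus, in that case, (6.15) follows
from (6.18).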
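The index convention $1 + ((i-1) \bmod N)$ used in the proof extends a finite sample periodically to arbitrarily many coordinates. A minimal sketch of this wrap-around follows; the helper names are hypothetical, and matrices and mark vectors are stored 0-based while the indices $i$ are 1-based as in the text.

```python
def wrap_index(i, n_pop):
    """Map a 1-based index i into {1, ..., n_pop} via 1 + ((i - 1) mod n_pop)."""
    return 1 + ((i - 1) % n_pop)


def extend_sample(r, u, length):
    """Return the first `length` coordinates of the periodic extension of (r, u).

    `r` is a full N x N distance matrix and `u` a list of N marks.
    """
    n_pop = len(u)
    idx = [wrap_index(i, n_pop) - 1 for i in range(1, length + 1)]
    r_ext = [[r[a][b] for b in idx] for a in idx]
    u_ext = [u[a] for a in idx]
    return r_ext, u_ext


if __name__ == "__main__":
    r = [[0.0, 1.0, 2.0], [1.0, 0.0, 2.0], [2.0, 2.0, 0.0]]
    u = ["a", "b", "c"]
    r_ext, u_ext = extend_sample(r, u, 5)
    print(u_ext)
    print(r_ext[0])
```

For a test function depending only on the first $n \le N$ coordinates the extension is invisible, which is why only the comparison between sampling with and without replacement enters the bound (6.18).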

It remains to prove the convergence of the selection operators in haploid 
and diploid selection cases. We give the proof in the haploid case; the diploid 



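case is similar. For $N > n$, we have

(6.19)
$$\begin{aligned}
\Omega^{\mathrm{sel},N}\Phi_N(u) = {} & \frac{\alpha}{N}\sum_{k,l=1}^{n}\langle\tilde\nu^{N,u}, \chi_k(\phi\circ\theta_{k,l} - \phi)\rangle + \frac{\alpha}{N}\sum_{k=1}^{N}\sum_{l=n+1}^{N}\langle\tilde\nu^{N,u}, \chi_k(\phi\circ\theta_{k,l} - \phi)\rangle \\
& + \frac{\alpha}{N}\sum_{k=n+1}^{N}\sum_{l=1}^{n}\langle\tilde\nu^{N,u}, \chi_k(\phi\circ\theta_{k,l} - \phi)\rangle.
\end{aligned}$$

Here the first summand on the right-hand side is of order $N^{-1}$, and the
second vanishes. Thus, we need to consider only the last summand. Define
the swapping operator $\tau_{k,l}$ through the permutation $\sigma_{k,l} := (1,\dots,k-1,l,k+
1,\dots,l-1,k,l+1,\dots,n)$ by $\tau_{k,l}(\underline{\underline{r}},\underline{u}) := R_{\sigma_{k,l}}(\underline{\underline{r}},\underline{u})$ [with an obvious extension of
the operator from (3.6) to finite $N$]. Observe that for $k > n$ and $l \le n$,
by exchangeability of $\tilde\nu^{N,u}$, since $\phi$ only depends on the first $n$ indices,

(6.20) $\langle\tilde\nu^{N,u}, \chi_k(\phi\circ\theta_{k,l})\rangle = \langle\tilde\nu^{N,u}, (\chi_l\phi)\circ\tau_{k,l}\rangle = \langle\tilde\nu^{N,u}, \chi_l\phi\rangle.$

Hence, for constants $C = C(n,\alpha,\chi,\phi)$ not depending on $u$ and possibly
changing from line to line, by exchangeability of $\tilde\nu^{N,u}$ and (6.19),

(6.21)
$$\begin{aligned}
|\Omega^{\mathrm{sel},N}\Phi_N(u) - \Omega^{\mathrm{sel}}\Phi(u)| &\le \Big|\frac{\alpha(N-n)}{N}\sum_{k=1}^{n}\langle\tilde\nu^{N,u}, \chi_k\phi - \chi_{n+1}\phi\rangle - \alpha\sum_{k=1}^{n}\langle\nu^{u}, \chi_k\phi - \chi_{n+1}\phi\rangle\Big| + \frac{C}{N} \\
&\le \frac{C}{N}
\end{aligned}$$

by the argument leading to (6.18). Since $C$ does not depend on $u$, (6.15)
follows. □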



6.3. Bounds on the number of ancestors, descendants and pairwise dis- 
tances. Here we provide bounds needed to prove the compact containment 
condition for the TMMMS. We use the notation from Definitions 2.2, 2.5 

and 3.13. Most importantly, $\mathcal{U}^N = (\mathcal{U}_t^N)_{t\ge 0}$ with $\mathcal{U}_t^N = (U_N, r_t^N, \mu_t^N)$ is the
TMMMS, and we use $A_s(l,t)$ to denote the ancestor of $(l,t)$ at time $s$.

The key to compact containment conditions for tree-valued processes arising
in the context of population models is to control the number of ancestors
at time $\varepsilon > 0$ in the past and the number of descendants of some given
subpopulation uniformly in the relevant parameter (here $N$); see Section 7.1.
For both we provide the needed estimates here.

The following birth and death process, more precisely its infimum, serves 
as an upper bound on the number of ancestors in the Moran model with 
mutation and selection. 

Definition 6.8 (The processes $J$ and $J^*$). Let $J = (J_t)_{t\ge 0}$ be the
homogeneous Markov jump process which jumps

(6.22)
from $j$ to $j+1$ at rate $j\alpha$,
from $j$ to $j-1$ at rate $\gamma\binom{j}{2}$.

Moreover, we define $J^* = (J_t^*)_{t\ge 0}$ by $J_t^* := \inf_{0\le s\le t} J_s$.

Proposition 6.9 (An upper bound for the number of ancestors). Let
$\mathcal{U}^N = (\mathcal{U}_t^N)_{t\ge 0}$ be the TMMMS as well as $J^* = (J_t^*)_{t\ge 0}$ from Definition 6.8,
started in $J_0^* = J_0 = j \in \mathbb{N}$. For $0 \le s \le t$ and $n_1,\dots,n_j \in U_N$ pairwise
different, set

(6.23) $A_{s,t}^N := \#\{A_s(n_i,t) : i = 1,\dots,j\}.$

Then

(6.24) $A_{s,t}^N \le J_{t-s}^*$ for all $0 \le s \le t$, $N \in \mathbb{N}$, stochastically.

Proof. Look at the graphical construction of the Moran model with
mutation and selection at time $t$. Following the ancestral lines of $n_1,\dots,n_j$
backward, two things might occur at some time $s$: at a resampling arrow
between two ancestral lines, these ancestral lines have a common ancestor,
and $A_{s,t}^N$ decreases by one. The rate of such an event is proportional to $\gamma$ and
the number of pairs. If an ancestral line hits the tip of a selective arrow, there
are two possible ancestors, one of which is the real one depending on the
types of the two. The process $J$ counts both of them, which certainly gives
an upper bound for the number of ancestors. This proves that $A_{s,t}^N \le J_{t-s}$
stochastically. Moreover, the number of ancestors can never increase when
going back in time, and hence, $A_{s,t}^N \le J_{t-s}^*$ follows. □

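Corollary 6.10 (The number of ancestors of the total population).
For $0 \le s < t$,

$$\mathbf{E}[A_{s,t}^N] \le \frac{(\gamma+2\alpha)e^{(\gamma/2+\alpha)(t-s)}N}{2\alpha+\gamma+\gamma(e^{(\gamma/2+\alpha)(t-s)}-1)N} \xrightarrow{N\to\infty} \frac{(\gamma+2\alpha)e^{(\gamma/2+\alpha)(t-s)}}{\gamma(e^{(\gamma/2+\alpha)(t-s)}-1)}.$$

Proof. Set $J_0 = N$. Writing $y(s) = \mathbf{E}[J_s]$ and using the backward
equation, we have

(6.25) $\dot y(s) = \alpha\mathbf{E}[J_s] - \gamma\mathbf{E}\big[\binom{J_s}{2}\big] \le \big(\tfrac{1}{2}\gamma+\alpha\big)y(s) - \tfrac{1}{2}\gamma\,y(s)^2,$

where we used Jensen's inequality in the last step. The solution of the initial
value problem

(6.26) $\dot z = \big(\tfrac{1}{2}\gamma+\alpha\big)z - \tfrac{1}{2}\gamma z^2, \quad z(0) = N$

is given by

(6.27) $z(s) = \dfrac{(\gamma+2\alpha)e^{(\gamma/2+\alpha)s}N}{2\alpha+\gamma+\gamma(e^{(\gamma/2+\alpha)s}-1)N}.$

The last three equations together with Proposition 6.9 give the assertion.
□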
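The closed form for the logistic initial value problem (6.26) can be checked numerically. The sketch below, with hypothetical function names, evaluates the solution, its $N \to \infty$ limit and the right-hand side of the differential equation, so that a finite-difference derivative can be compared against it.

```python
from math import exp


def z_closed_form(s, n, gamma, alpha):
    """Solution of z' = (gamma/2 + alpha) z - (gamma/2) z^2 with z(0) = n."""
    e = exp((gamma / 2.0 + alpha) * s)
    return (gamma + 2.0 * alpha) * e * n / (2.0 * alpha + gamma + gamma * (e - 1.0) * n)


def z_limit(s, gamma, alpha):
    """Limit of z_closed_form as n tends to infinity (requires s > 0)."""
    e = exp((gamma / 2.0 + alpha) * s)
    return (gamma + 2.0 * alpha) * e / (gamma * (e - 1.0))


def rhs(z, gamma, alpha):
    """Right-hand side of the logistic equation (6.26)."""
    return (gamma / 2.0 + alpha) * z - (gamma / 2.0) * z * z


if __name__ == "__main__":
    g, a, n, s, h = 1.0, 0.5, 100.0, 0.5, 1e-5
    derivative = (z_closed_form(s + h, n, g, a) - z_closed_form(s - h, n, g, a)) / (2 * h)
    print("finite-difference derivative:", derivative)
    print("right-hand side of the ODE:  ", rhs(z_closed_form(s, n, g, a), g, a))
    print("bound for finite N:", z_closed_form(s, n, g, a), " limit:", z_limit(s, g, a))
```

The limit is finite for every $s > 0$, which reflects that the bound on the expected number of ancestors a positive time in the past does not depend on the population size.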

Our next task is to bound the frequency of descendants. 

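Definition 6.11 (Frequency of descendants in TMMMS and filtration).
(1) Let $\mathcal{U}^N := (\mathcal{U}_t^N)_{t\ge 0}$ be the TMMMS with population size $N$ defined by
the graphical construction. For $s \le t$ and $V \subseteq U_N$, we define

(6.28) $D_t^N(V,s) := \{l \in U_N : A_s(l,t) \in V\},$

the set of descendants of $V$ at time $t$.

(2) For the TMMMS $\mathcal{U}^N = (\mathcal{U}_t^N)_{t\ge 0}$, recall the Poisson processes $\eta^{\mathrm{res}}, \eta^{\mathrm{mut}},
\eta^{\mathrm{sel}}$ on $U_N \times \mathbb{R}_+$ and $S_N(t) = U_N \times (-\infty,t]$ from Definition 2.2. We define
the filtration $(\mathcal{A}_t^N)_{t\ge 0}$ by $\mathcal{A}_t^N = \sigma(\eta^{\mathrm{res}}|_{S_N(t)}, \eta^{\mathrm{mut}}|_{S_N(t)}, \eta^{\mathrm{sel}}|_{S_N(t)})$.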

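Lemma 6.12 (Bounds on the frequency of descendants). For $0 < \varepsilon < T$
there is $\delta > 0$ such that for $0 \le s \le T$ and any sequence $(V^N)_{N\in\mathbb{N}}$ of $\mathcal{A}_s^N$-
measurable subsets of $U_N$, we have

(6.29) $\limsup_{N\to\infty}\mu_s^N(V^N) < \delta \implies \limsup_{N\to\infty}\mathbf{P}\big(\sup_{s\le t\le T}\mu_t^N(D_t^N(V^N,s)) > \varepsilon\big) < \varepsilon.$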

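Proof. By time-homogeneity of the TMMMS, it suffices to show the
assertion for $s = 0$. We restrict ourselves to the haploid case. The extension
to the diploid case is straightforward. The proof is based on a coupling
argument that we describe next.

For $N \in \mathbb{N}$, consider the graphical construction of $\mathcal{U}^N = (\mathcal{U}_t^N)_{t\ge 0}$, given
by means of the Poisson processes $(\eta^{\mathrm{res}}, \eta^{\mathrm{mut}}, \eta^{\mathrm{sel}})$. Moreover, let $V^N$
satisfy the assumption on the left-hand side of (6.29). We define a process
$\tilde{\mathcal{U}}^N = (\tilde{\mathcal{U}}_t^N)_{t\ge 0}$ with $\tilde{\mathcal{U}}_t^N = (U_N, \tilde r_t^N, \tilde\mu_t^N)$, taking values in $\mathbb{U}_N^{\{\bullet,\circ\}}$, with the
following features:

(i) for $k \in V^N$, set $\tilde u_k(0) = \bullet$, for $k \notin V^N$, set $\tilde u_k(0) = \circ$,

(ii) $\chi(\circ) = 1 - \chi(\bullet) = 0$, that is, only $\bullet$ can use events in $\eta^{\mathrm{sel}}$,

(iii) $\vartheta = 0$, that is, mutation is absent.

For the dynamics of $\tilde{\mathcal{U}}^N$, use the same Poisson processes $\eta^{\mathrm{res}}$ and $\eta^{\mathrm{sel}}$ as $\mathcal{U}^N$.
Note that $(X_t^N)_{t\ge 0}$, given by $X_t^N = \tilde\mu_t^N(\tilde D_t^N(V^N,0))$, is a Markov jump process
with transitions

from $x$ to $x + \frac{1}{N}$ at rate $\frac{\gamma}{2}N^2x(1-x) + \alpha Nx(1-x)$,

from $x$ to $x - \frac{1}{N}$ at rate $\frac{\gamma}{2}N^2x(1-x)$.

In particular, $(X_t^N)_{t\ge 0}$ converges weakly (with respect to the Skorohod
topology) to the solution $(X_t)_{t\ge 0}$ of the SDE

(6.30) $dX = \alpha X(1-X)\,dt + \sqrt{\gamma X(1-X)}\,dW.$

By construction of $\tilde{\mathcal{U}}^N$, we find that $\mu_t^N(D_t^N(V^N,0)) \le X_t^N$, and hence, if
$\limsup_{N\to\infty}\mu_0^N(V^N) < \delta$ for some $\delta > 0$, then

(6.31)
$$\begin{aligned}
\limsup_{N\to\infty}\mathbf{P}\big(\sup_{0\le t\le T}\mu_t^N(D_t^N(V^N,0)) > \varepsilon\big) &\le \limsup_{N\to\infty}\mathbf{P}\big(\sup_{0\le t\le T}X_t^N > \varepsilon\big) \\
&\le \mathbf{P}\big(\sup_{0\le t\le T}X_t > \varepsilon \,\big|\, X_0 = \delta\big).
\end{aligned}$$

By Doob's maximal inequality, for each $\varepsilon > 0$, we find $\delta > 0$ such that

(6.32) $\mathbf{P}\big(\sup_{0\le t\le T}X_t > \varepsilon \,\big|\, X_0 = \delta\big) < \varepsilon,$

and the result follows. □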
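The role of the limiting diffusion (6.30) in this argument can be illustrated by simulation. The following is a rough Euler-Maruyama sketch, not part of the proof: the step size, the projection onto $[0,1]$ and all function names are assumptions made for the illustration. It estimates the probability that the path started at a small value $\delta$ ever exceeds $\varepsilon$ before time $T$.

```python
import random
from math import sqrt


def euler_maruyama_sup(x0, alpha, gamma, t_max, steps, rng):
    """Return the running supremum of an Euler-Maruyama path of
    dX = alpha X (1 - X) dt + sqrt(gamma X (1 - X)) dW on [0, t_max]."""
    dt = t_max / steps
    x, sup_x = x0, x0
    for _ in range(steps):
        dw = rng.gauss(0.0, sqrt(dt))  # Brownian increment over one step
        x += alpha * x * (1.0 - x) * dt + sqrt(gamma * x * (1.0 - x)) * dw
        x = min(1.0, max(0.0, x))      # project back onto [0, 1]
        sup_x = max(sup_x, x)
    return sup_x


def exceed_probability(delta, eps, alpha, gamma, t_max, steps, runs, rng):
    """Monte Carlo estimate of P(sup_{t <= t_max} X_t > eps | X_0 = delta)."""
    hits = sum(
        euler_maruyama_sup(delta, alpha, gamma, t_max, steps, rng) > eps
        for _ in range(runs)
    )
    return hits / runs


if __name__ == "__main__":
    rng = random.Random(7)
    for delta in (0.2, 0.05, 0.01):
        p = exceed_probability(delta, 0.5, 1.0, 1.0, 1.0, 500, 500, rng)
        print(f"delta = {delta}: estimated exceedance probability {p:.3f}")
```

In such experiments the estimated probability decreases as $\delta$ decreases, which is the qualitative content of (6.32). The numbers depend on the discretization and should not be read as precise.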

The next result is a corollary of the previous lemma and Proposition 6.9. 

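Corollary 6.13 (Tightness of pairwise distances). Assume that $(\mathcal{U}_0^N)_{N\in\mathbb{N}}$
is tight. Let $R_{12}^N(t)$ be the distance between two individuals drawn at random
from $\mathcal{U}_t^N$. For any $\varepsilon > 0$, there is $C = C(\varepsilon) < \infty$ such
that for all $t \ge 0$,

(6.33) $\limsup_{N\to\infty}\mathbf{P}(R_{12}^N(t) > C) < \varepsilon.$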

Proof. Let $\varepsilon > 0$ be given. For the process $J$ from Definition 6.8
with $J_0 = 2$, let $T_1 = \inf\{t \ge 0 : J_t = 1\}$. As a birth and death process with
quadratic death and linear birth rates, $J$ is recurrent and irreducible. Choose
$C_1 > 0$ so that

(6.34) $\mathbf{P}\big(T_1 > \tfrac{C_1}{2}\big) < \varepsilon.$

For $C_2 > 0$ and $\mathcal{U}_0^N = (U_N, r_0^N, \mu_0^N)$, consider the family of subsets of $U_N$

$\mathcal{W}^N := \{W \subseteq U_N : r_0^N(g_1,g_2) < C_2 \text{ for all } g_1,g_2 \in W\}.$

Clearly, $\mathcal{W}^N$ contains maximal elements (with respect to "$\subseteq$"), and we
denote by $W^N$ an arbitrary maximal element of $\mathcal{W}^N$. Set $V^N = U_N \setminus W^N$.
By the tightness assumption and Lemma 6.12, we may choose $C_2$ and $\delta > 0$
such that

$\limsup_{N\to\infty}\mu_0^N(V^N) < \delta \implies \limsup_{N\to\infty}\mathbf{P}\big(\sup_{0\le t\le C_1/2}\mu_t^N(D_t^N(V^N,0)) > \varepsilon\big) < \varepsilon.$

To continue we have to distinguish whether $t \in [0,C_1/2]$ or not.

For $t \in [0,C_1/2]$ the event $\{R_{12}^N(t) > C_1 + C_2\}$ means that the ancestral
lines of a pair of individuals drawn at time $t$ did not coalesce in the time
interval $[0,t]$ and that the distance of their ancestors at time $0$ is at least
$C_1 + C_2 - 2t \ge C_2$. By the choice of $C_1$ and $C_2$, we have

(6.35) $\limsup_{N\to\infty}\mathbf{P}(R_{12}^N(t) > C_1 + C_2) < \varepsilon$ for all $t \in [0,C_1/2]$.

In the case $t > C_1/2$ the event $\{R_{12}^N(t) > C_1\}$ means that a randomly chosen
pair of ancestral lines did not coalesce in the time interval $[t - C_1/2, t]$, that
is,

(6.36) $\{R_{12}^N(t) > C_1\} = \{A_{t-C_1/2,t}^N = 2\}.$

By Proposition 6.9 and the choice of $C_1$ it follows that for $t > C_1/2$
(independent of $N$),

(6.37) $\mathbf{P}(R_{12}^N(t) > C_1) = \mathbf{P}(A_{t-C_1/2,t}^N = 2) \le \mathbf{P}\big(T_1 > \tfrac{C_1}{2}\big) < \varepsilon.$

Combining (6.35) and (6.37) we obtain (6.33) with $C = C_1 + C_2$. □

7. Proofs of Theorems 1, 3 and 4. Now we have all ingredients for the 
proofs of our main Theorems 1, 3 and 4. 

7.1. Proof of Theorems 1 and 3. We prove Theorems 1 and 3
simultaneously. The main step in the proof is to show that the family of
processes $\{\mathcal{U}^N : N \in \mathbb{N}\}$ is tight and that all limit points solve the $(\mathbf{P}_0, \Omega, \Pi^1)$-
martingale problem and fulfill (b) of Theorem 1. Uniqueness of the solution
of the $(\mathbf{P}_0, \Omega, \Pi^1)$-martingale problem is a consequence of the duality
relation given by Proposition 5.3(2) [see Ethier and Kurtz (1986), Proposition
4.4.7]. Note that the set of duality functions from (5.2) is separating on
$\mathcal{M}_1(\mathbb{U}^I)$ by Proposition 5.3(1). Finally, properties (a) and (e) from Theorem 1 are
direct consequences of Propositions 4.5 and 4.10.

In order to establish tightness of $\{\mathcal{U}^N : N \in \mathbb{N}\}$ and property (b) of
Theorem 1, we use Lemma 4.5.1 and Remark 4.5.2 of Ethier and Kurtz (1986),
requiring us to check two conditions: a convergence relation for generators
and a compact containment condition. To verify the first, recall that we
showed convergence of generators of TMMMS to the generator of TFVMS
in Proposition 6.7.

Hence, we have to verify the second condition, which amounts to showing the
following compact containment conditions: for all $\varepsilon, T > 0$ there exist sets
$\Gamma_{\varepsilon,T} \subseteq \mathbb{U}^I$, relatively compact in $\mathbb{U}^I$, and $\tilde\Gamma_{\varepsilon,T} \subseteq \mathbb{U}_c^I$, relatively compact in $\mathbb{U}_c^I$,
such that

(7.1)
$$\begin{aligned}
\inf_{N\in\mathbb{N}}\mathbf{P}\big(\mathcal{U}_t^N \in \Gamma_{\varepsilon,T} \text{ for all } \varepsilon \le t \le T\big) &> 1 - \varepsilon, \\
\inf_{N\in\mathbb{N}}\mathbf{P}\big(\mathcal{U}_t^N \in \tilde\Gamma_{\varepsilon,T} \text{ for all } 0 \le t \le T\big) &> 1 - \varepsilon.
\end{aligned}$$

For $x = (X,r,\mu)$, we set $\pi_1(x) := (X, r, (\pi_X)_*\mu)$. Since $I$ is compact, it is
a consequence of Theorem 3 in Depperschmidt, Greven and Pfaffelhuber
(2011), that $\Gamma_{\varepsilon,T} \subseteq \mathbb{U}^I$ ($\tilde\Gamma_{\varepsilon,T} \subseteq \mathbb{U}_c^I$) is relatively compact in $\mathbb{U}^I$ ($\mathbb{U}_c^I$) if and
only if $\pi_1(\Gamma_{\varepsilon,T})$ [$\pi_1(\tilde\Gamma_{\varepsilon,T})$] is relatively compact in $\mathbb{U}$ ($\mathbb{U}_c$).

In order to check existence of $\Gamma_{\varepsilon,T}$ ($\tilde\Gamma_{\varepsilon,T}$) such that (7.1) holds with $\mathcal{U}_t^N$
replaced by $\pi_1(\mathcal{U}_t^N)$ and $\Gamma_{\varepsilon,T}$ ($\tilde\Gamma_{\varepsilon,T}$) replaced by $\pi_1(\Gamma_{\varepsilon,T})$ [$\pi_1(\tilde\Gamma_{\varepsilon,T})$], we use
Proposition 2.22 of Greven, Pfaffelhuber and Winter (2012). This result gives
a condition for (7.1), based on estimates on the number of ancestors at time
$\varepsilon > 0$ in the past and in terms of frequencies of descendants of rare ancestors.
First, we note that $(\pi_1(\mathcal{U}_t^N))_{t\ge 0}$ fits the definition of a tree-valued version of a
population model from Proposition 2.18 of Greven, Pfaffelhuber and Winter
(2012). For (i) of that proposition, the required bound on the frequency of
descendants is given in Lemma 6.12. Moreover, (ii) of that proposition is a
consequence of Corollary 6.10. Hence, (7.1) follows.

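Except for (c) and (d) of Theorem 1 the proof of Theorems 1 and 3 is
complete by the above arguments. To prove the Feller property of $\mathcal{U}$, part
(c) of Theorem 1, we use duality. Let $\mathcal{U}^{u} = (\mathcal{U}_t^{u})_{t\ge 0}$ be the TFVMS started
in $\mathcal{U}_0^{u} = u$ and $u, u_1, u_2, \dots$ be such that $u_n \to u$ as $n \to \infty$ in the Gromov-weak
topology and let $t > 0$ be fixed. First we note that for $\Phi = \Phi^{n,\phi} \in \Pi^1$,

$$\mathbf{E}[\Phi(\mathcal{U}_t^{u_n})] = \mathbf{E}[\langle\nu^{\mathcal{U}_t^{u_n}}, \phi\rangle] = \mathbf{E}[\langle\nu^{u_n}, \Xi_t\rangle] \xrightarrow{n\to\infty} \mathbf{E}[\langle\nu^{u}, \Xi_t\rangle] = \mathbf{E}[\Phi(\mathcal{U}_t^{u})]$$

by Proposition 5.3, where $\Xi = (\Xi_t)_{t\ge 0}$ is the dual process from Definition 5.1
with $\Xi_0 = \phi$. Hence, by Theorem 5 in Depperschmidt, Greven and Pfaffelhuber
(2011), $\mathcal{U}_t^{u_n} \Rightarrow \mathcal{U}_t^{u}$ as $n \to \infty$, and the Feller property follows.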

For (d) in Theorem 1 notice that the strong Markov property follows from
the Feller property by standard theory [e.g., Theorem 4.2.7 in Ethier and
Kurtz (1986), and note that local compactness of the state space is not used
in the proof].

7.2. Proof of Theorem 4. As observed before Theorem 4, a unique
equilibrium for $\mathcal{U}$ implies a unique equilibrium for $\zeta$, so we are left with showing
the converse.

If we have convergence from every initial point to a limiting law, then
this law is the unique invariant measure of the process. In order to see
that the limiting law is invariant, consider $f \in C_b(\mathbb{U}^I)$, and let $(S_t)_{t\ge 0}$ be
the semigroup of the TFVMS. Since the map $u \mapsto (S_tf)(u)$ is continuous by
Theorem 1(c), the limiting law is invariant using the same argument as in
Proposition 1.8(d) of Liggett (1985). Hence we have to establish the
convergence statement. Recall that the family $\{u \mapsto \langle\nu^{u}, \xi\rangle : \xi \in \mathcal{T}\}$ is separating;
see Proposition 5.3. Hence we have to show two assertions [see,
e.g., Ethier and Kurtz (1986), Lemma 3.4.3]:

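(i) The family $\{\mathcal{U}_t : t \ge 1\}$ is tight in $\mathbb{U}^I$.

(ii) For all $\xi \in \mathcal{T}$, $\lim_{t\to\infty}\mathbf{E}_u[\langle\nu^{\mathcal{U}_t}, \xi\rangle]$ exists and does not depend on $u$.

When these two properties hold, we conclude from (i) that there are
convergent subsequences of $(\mathcal{U}_t)_{t\ge 0}$. Let $u \in \mathbb{U}^I$ and $t_1, t_2, \dots$ be such that $\mathcal{U}_\infty$
is the weak limit of $(\mathcal{U}_{t_n})_{n=1,2,\dots}$, started in $u$. Then, (ii) implies that, for all
$\Phi \in \Pi^1$ with $\Phi(u) = \langle\nu^{u}, \xi\rangle$ and $\xi \in \mathcal{T}$,

(7.2) $\mathbf{E}_u[\Phi(\mathcal{U}_\infty)] = \lim_{n\to\infty}\mathbf{E}_u[\langle\nu^{\mathcal{U}_{t_n}}, \xi\rangle] = \lim_{t\to\infty}\mathbf{E}_u[\langle\nu^{\mathcal{U}_t}, \xi\rangle]$

exists and is independent of $u$.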

We start by proving (i). By Theorem 4 in Depperschmidt, Greven and
Pfaffelhuber (2011), we need to show that $\{\pi_1(\mathcal{U}_t) : t \ge 1\}$ is tight in $\mathbb{U}_c$. For
this, we use Proposition 6.2 of Greven, Pfaffelhuber and Winter (2012). In
particular, we have to check that:

(1) $\{R_{12}(\mathcal{U}_t) : t \ge 1\}$ is tight,

(2) $\{A_{t-\varepsilon,t} : t \ge 1\}$ is tight for $0 < \varepsilon < 1$, where $A_{t-\varepsilon,t}$ from Definition 2.2
is the number of ancestors of $\mathcal{U}_t$ at time $t - \varepsilon$, or, equivalently, the number
of $2\varepsilon$-balls needed to cover $\mathcal{U}_t$.

Once (1) and (2) are shown, let $\delta > 0$. It is straightforward to construct a
set $\Gamma_\delta \subseteq \mathbb{U}_c$ which fulfills (i) and (ii) of Proposition 6.2 of Greven,
Pfaffelhuber and Winter (2012) with $\inf_{t\ge 1}\mathbf{P}(\mathcal{U}_t \in \Gamma_\delta) > 1 - \delta$. While (1) is true by
Corollary 6.13, (2) holds according to Corollary 6.10.

We now show (ii) if $\zeta$ has a unique equilibrium. Consider the process
$\Xi = (\Xi_t)_{t\ge 0}$ from Definition 5.1. Recall from Proposition 5.4(3) that there
is an almost surely finite $T$ such that $\Xi_T$ does not depend on $\underline{\underline{r}}$. We use the
duality relation from Proposition 5.3 and the strong Markov property of $\Xi$
to see that for $\Xi_0 = \xi \in \mathcal{T}$,

(7.3)
$$\begin{aligned}
\lim_{t\to\infty}\mathbf{E}_u[\langle\nu^{\mathcal{U}_t}, \xi\rangle] &= \lim_{t\to\infty}\mathbf{E}_\xi[\langle\nu^{u}, \Xi_t\rangle] = \lim_{t\to\infty}\mathbf{E}_\xi\big[\mathbf{E}_{\Xi_T}[\langle\nu^{u}, \Xi_t\rangle]\big] \\
&= \lim_{t\to\infty}\int \mathbf{E}_u[\langle\nu^{\mathcal{U}_t}, \xi'\rangle]\,\mathbf{P}_\xi(\Xi_T \in d\xi')
\end{aligned}$$

exists and does not depend on $u$. This holds since for $\xi' \in \mathcal{T}$, not depending
on $\underline{\underline{r}}$, the limit $\lim_{t\to\infty}\mathbf{E}_u[\langle\nu^{\mathcal{U}_t}, \xi'\rangle]$ exists and is independent of $u$ since $\zeta$
has a unique equilibrium. Note that $t \mapsto \|\Xi_t\|_\infty$ is nonincreasing by
Proposition 5.4(1) and therefore, all expectations in (7.3) are well defined.

Next, we show that (ii) holds if $\vartheta > 0$, $\alpha \ge 0$ and mutation has a
parent-independent component, again using the dual process $\Xi = (\Xi_t)_{t\ge 0}$ from
Definition 5.1. From Proposition 5.4(2) we know that $\Xi$ converges almost
surely to a (random) constant function $\Xi_\infty$ taking values in $C_0^1$. Hence, for
$\Xi_0 = \xi \in \mathcal{T}$,

(7.4) $\lim_{t\to\infty}\mathbf{E}_u[\langle\nu^{\mathcal{U}_t}, \xi\rangle] = \lim_{t\to\infty}\mathbf{E}_\xi[\langle\nu^{u}, \Xi_t\rangle] = \mathbf{E}_\xi[\langle\nu^{u}, \Xi_\infty\rangle] = \mathbf{E}_\xi[\Xi_\infty],$

where the expression on the right-hand side does not depend on $u$. Again,
note that $t \mapsto \|\Xi_t\|_\infty$ is nonincreasing by Proposition 5.4(1) and therefore,
all expectations in (7.4) are well defined. Hence, (ii) follows if either $\zeta$ is
ergodic or if mutation has an independent part, and this completes the
proof of Theorem 4.

7.3. Proof of Theorem 2. Before we turn to the proof of Theorem 2, we 
recall the Girsanov transform for continuous semimartingales from Kallen- 
berg (2002), Theorems 18.19 and 18.21. 

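Lemma 7.1 (The Girsanov theorem for continuous semimartingales). Let
$\mathcal{M} = (M_t)_{t\ge 0}$ be a continuous $\mathbf{P}$-martingale for some probability measure
$\mathbf{P}$, and assume $Z = (Z_t)_{t\ge 0}$, given by $Z_t = e^{M_t - (1/2)[\mathcal{M}]_t}$, is a martingale.
If $\mathcal{N} = (N_t)_{t\ge 0}$ is a local $\mathbf{P}$-martingale, and $\mathbf{Q}$ is defined via its Radon-
Nikodym derivative with respect to $\mathbf{P}$, that is, $\frac{d\mathbf{Q}}{d\mathbf{P}}\big|_{\mathcal{F}_t} = Z_t$, then $\mathcal{N} - [\mathcal{M},\mathcal{N}]$
is a local $\mathbf{Q}$-martingale. (Here, $[\mathcal{M},\mathcal{N}]$ is the covariation process between
$\mathcal{M}$ and $\mathcal{N}$ and $[\mathcal{M}] = [\mathcal{M},\mathcal{M}]$.)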

Proof of Theorem 2. Since $|\alpha' - \alpha| < \infty$, $\Psi$ is bounded, and
therefore the right-hand side of (3.43) is a martingale. Thus $\mathbf{Q}$ is well defined.

By Theorem 5 from Depperschmidt, Greven and Pfaffelhuber (2011), $\Pi^1$
contains an algebra that separates points, so the TFVMS fulfills the
assumptions of Proposition 4.5. The generator $\Omega_\alpha$ is second order by
Proposition 4.10, and its only second order term is $\Omega^{\mathrm{res}}$. In particular, we can use
Corollary 4.6. This is important since the additional drift term introduced
by the Girsanov change of measure is given by a covariation; see Lemma 7.1.
We have to compute $[\Phi(\mathcal{U}), \Psi(\mathcal{U})]$ for $\Phi(\mathcal{U}) = (\Phi(\mathcal{U}_t))_{t\ge 0}$, $\Psi(\mathcal{U}) = (\Psi(\mathcal{U}_t))_{t\ge 0}$
for $\Phi \in \Pi^1$ and $\Psi$ from (3.41). We take $\Phi \in \Pi^1$ and compute, using the
symmetry of $\chi'$,

J)'"*^^($(«) ■ ^(h)) - ^{u) ■ Q'-"^$(«) - $(u) • Q:''''^{u) 

= (17'--(Z.«,0. (x'i,2 0p?)) - (l^",x'l,2) 

7 

^■•-(^.",X'1,2))
a' — a
2 



f n+2 

E^^"''^-(^'l>2-/'?)°^M-'/'-(x'l,2 0P?)) 



fc,Z=l 

- 2{V\ • (x'i,2 ° ^1,2 ° P?) - </> • (X'l,2 ° Px)) 

n 

{a - a) E(l/", (0 • Xn+l,n+2) ° 6'fc,n+l " ' Xn+l,n+2) 



fc=l 

n 



{a' - a) E(l/", • Xfe,n+1 - ■ Xn+l,n+2) 



fc=l 

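Hence, Corollary 4.6 implies that

(7.5) $[\Phi(\mathcal{U}), \mathcal{M}]_t = [\Phi(\mathcal{U}), \Psi(\mathcal{U})]_t = \int_0^t \big(\Omega^{\mathrm{sel}}_{\alpha'}\Phi(\mathcal{U}_s) - \Omega^{\mathrm{sel}}_{\alpha}\Phi(\mathcal{U}_s)\big)\,ds,$

where $\mathcal{U} = (\mathcal{U}_t)_{t\ge 0}$ is a solution of the $(\mathbf{P}_0, \Omega_\alpha, \Pi^1)$-martingale problem. For
any $\Phi \in \Pi^1$,

(7.6) $\mathcal{M}^\Phi := \Big(\Phi(\mathcal{U}_t) - \int_0^t \Omega_\alpha\Phi(\mathcal{U}_s)\,ds\Big)_{t\ge 0}$

is a continuous $\mathbf{P}$-martingale. Thus, by Girsanov's theorem for continuous
semimartingales, Lemma 7.1 and (7.5), we see that

$$\Big(\Phi(\mathcal{U}_t) - \int_0^t \Omega_\alpha\Phi(\mathcal{U}_s)\,ds - \int_0^t \big(\Omega^{\mathrm{sel}}_{\alpha'}\Phi(\mathcal{U}_s) - \Omega^{\mathrm{sel}}_{\alpha}\Phi(\mathcal{U}_s)\big)\,ds\Big)_{t\ge 0}$$

is a $\mathbf{Q}$-martingale for $\mathbf{Q}$ defined by (3.43). Since $\Phi \in \Pi^1$ was arbitrary, it
follows that $\mathbf{Q}$ solves the $(\mathbf{P}_0, \Omega_{\alpha'}, \Pi^1)$-martingale problem. □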

8. Proof of Theorem 5. If $\mathcal U^{\alpha}_{\infty}$ is as in Theorem 5, the proof is based on
the fact that

(8.1) $\mathbf E[\Omega^{\alpha}\Phi(\mathcal U^{\alpha}_{\infty})] = 0$

for $\Phi \in \Pi^1$. (This follows easily from the $\Omega^{\alpha}$-martingale problem.) Moreover,
for small $\alpha > 0$, the equilibrium $\mathcal U^{\alpha}_{\infty}$ is close to the equilibrium without
selection, and the equilibrium under neutrality is well understood. In order
to use this knowledge for the neutral case, the following fact is fundamental.

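Lemma 8.1 (Continuity of $\alpha \mapsto \mathcal U^{\alpha}_{\infty}$). Let $\mathcal U^{\alpha}_{\infty}$ be as in Theorem 5. Then,
for $\Phi \in \Pi^1$,

(8.2) $\mathbf E[\Phi(\mathcal U^{\alpha}_{\infty})] - \mathbf E[\Phi(\mathcal U^{0}_{\infty})] = O(\alpha)$ as $\alpha \to 0$.

Proof. First, note that mutation is parent-independent here, $z = 1$. Let
$\Phi = \Phi^{n,\phi}$ with $\phi \in \overline{\mathcal C}_n$. Recall from the proof of Theorem 4 [see (7.4)]
that $\mathbf E[\Phi(\mathcal U^{\alpha}_{\infty})] = \mathbf E[\mathcal H^{\alpha}_{\infty}]$, where $(\mathcal H^{\alpha}_t)_{t\geq 0}$ is the dual process with selection
coefficient $\alpha$ and $\mathcal H^{\alpha}_0 = \phi$. For the proof, we couple the dual processes for
selection coefficients $\alpha$ and $0$ using the same transitions as given by (5.3),
(5.6) and (5.8). Recall that there is a random time $T < \infty$ such that $\mathcal H^{\alpha}_t = \mathcal H^{\alpha}_T$
for $t > T$, so that $\mathcal H^{\alpha}_{\infty} = \mathcal H^{\alpha}_T$. The only difference between $(\mathcal H^{\alpha}_t)_{t\geq 0}$ and $(\mathcal H^{0}_t)_{t\geq 0}$ is
that only the former process can make transitions given by (5.10) or (5.11).
Hence, for the coupled process, we get $\mathcal H^{\alpha}_{\infty} = \mathcal H^{0}_{\infty}$ if no such transition occurs
before time $T$. Consider a time $s$ when $\mathcal H^{\alpha}_s \in \overline{\mathcal C}_k$. By (5.13), the chance that the
next transition after time $s$ is a selective event is (recall $z = 1$)
$\alpha k/(\alpha k + \gamma\binom{k}{2} + \vartheta k)$. Hence, for some finite $C, C' > 0$, depending only on $\phi$ and $n$,

(8.3) $|\mathbf E[\Phi(\mathcal U^{\alpha}_{\infty})] - \mathbf E[\Phi(\mathcal U^{0}_{\infty})]| = |\mathbf E[\mathcal H^{\alpha}_{\infty}] - \mathbf E[\mathcal H^{0}_{\infty}]| \leq C \cdot \mathbf P[\mathcal H^{\alpha}_{\infty} \neq \mathcal H^{0}_{\infty}] \leq C' \alpha$

for small $\alpha$ and the result follows. □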

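We start more generally than needed in the proof of Theorem 5. In particular,
given that $\underline{\underline r}$ is the distance matrix of an ultrametric tree, we define tree
lengths for subtrees of any finite number of leaves.

Definition 8.2 (Tree lengths and test functions). (1) For $\underline{\underline r} \in \mathbb R_+^{\binom{\mathbb N}{2}}$, we
define

(8.4) $\ell_n(\underline{\underline r}) := \inf_{\sigma \in \Sigma_n^1} \frac12 \sum_{i=1}^{n} r_{i,\sigma(i)},$

where $\Sigma_n^1 \subseteq \Sigma_n$ is the set of permutations of $\mathbb N$ leaving $n+1, n+2, \ldots$ constant
and having exactly one cycle on $1, \ldots, n$.

(2) For fixed $\lambda > 0$, let $\phi^{n}_{ij} \in \overline{\mathcal C}^1_{n+j}$ be of the form

(8.5) $\phi^{n}_{ij}(\underline{\underline r}, \underline u) = e^{-\lambda \ell_n(\underline{\underline r})} \cdot 1_{\{u_1 = \bullet\}} \cdots 1_{\{u_i = \bullet\}} \cdot 1_{\{u_{n+1} = \bullet\}} \cdots 1_{\{u_{n+j} = \bullet\}}.$

For consistency, we define $\phi^{1}_{00} := 1$. Moreover, we set $\Phi^{n}_{ij} := \Phi^{n+j, \phi^{n}_{ij}}$.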
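The tree length in (8.4) can be evaluated by brute force for small samples. The following is a minimal sketch, assuming that $\ell_n$ is half the minimal sum of distances along a permutation with exactly one cycle on the $n$ leaves; the function name `tree_length` and the two example matrices are hypothetical illustrations and not part of the original text.

```python
from itertools import permutations

def tree_length(r):
    """Half the minimal cyclic tour length through all leaves of the distance matrix r."""
    n = len(r)
    if n == 1:
        return 0.0  # a single leaf spans a tree of length zero
    best = float("inf")
    # Fix leaf 0 as the start; every ordering of the remaining leaves gives one n-cycle.
    for rest in permutations(range(1, n)):
        cycle = (0,) + rest
        total = sum(r[cycle[k]][cycle[(k + 1) % n]] for k in range(n))
        best = min(best, total / 2)
    return best

# Ultrametric examples: the distance of two leaves is twice their coalescence height.
# Leaves 0 and 1 merge at height 1 and join leaf 2 at height 2.
r3 = [[0, 2, 4],
      [2, 0, 4],
      [4, 4, 0]]
# Two pairs (0,1) and (2,3) merge at height 1 and are joined at height 2.
r4 = [[0, 2, 4, 4],
      [2, 0, 4, 4],
      [4, 4, 0, 2],
      [4, 4, 2, 0]]

print(tree_length(r3), tree_length(r4))  # 5.0 6.0
```

Both values agree with the sum of the branch lengths of the corresponding trees (1 + 1 + 1 + 2 and 4 · 1 + 2 · 1). The enumeration visits $(n-1)!$ cycles, so it is only meant for very small $n$.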
Remark 8.3 (Interpretation). (1) If $\underline{\underline r}$ is the distance matrix arising by
sampling points $x_1, x_2, \ldots$ from an ultrametric space $(U, r, \mu)$, it was shown
in Lemma 3.1 of Greven, Pfaffelhuber and Winter (2012) that $\ell_n(\underline{\underline r})$ gives
the length of the subtree spanned by $x_1, \ldots, x_n$.

(2) Considering $\Phi^{n}_{ij}(u)$ as a function of $\lambda$ gives the Laplace transform of the
subtree length of $n$ sampled points from $u$ on the set where $i$ points within
the subtree and an additional number $j$ outside the subtree carry allele $\bullet$.
In particular, $\phi^{n}_{ij}$ depends on the first $n + j$ points, and hence $\Phi^{n}_{ij}$ is a
polynomial based on a sample of $n + j$ points.

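8.1. Equilibrium distances under neutrality. The action of $\Omega^{\alpha}$ on functions
given in Definition 8.2 has a particularly nice form for $\alpha = 0$. Recall that
$\Omega^{\alpha}$ denotes the generator given in (3.24) for $\alpha \geq 0$.

Lemma 8.4 (Action of $\Omega^{0}$ on $\Phi^{n}_{ij}$). Let $\alpha = 0$ and $\Phi^{n}_{ij}$ be as in Definition 8.2.
Then

(8.6) $\Omega^{0}\Phi^{n}_{ij} = -n\lambda\,1_{\{n\geq 2\}}\,\Phi^{n}_{ij} + \frac{i}{2}\big(\vartheta_{\bullet}\Phi^{n}_{i-1,j} - \vartheta\,\Phi^{n}_{ij}\big) + \frac{j}{2}\big(\vartheta_{\bullet}\Phi^{n}_{i,j-1} - \vartheta\,\Phi^{n}_{ij}\big)$
$\qquad + \gamma\Big(\binom{n-i}{2}\big(\Phi^{n-1}_{ij} - \Phi^{n}_{ij}\big) + i(n-i)\big(\Phi^{n-1}_{ij} - \Phi^{n}_{ij}\big) + \binom{i}{2}\big(\Phi^{n-1}_{i-1,j} - \Phi^{n}_{ij}\big)$
$\qquad\quad + ij\big(\Phi^{n}_{i,j-1} - \Phi^{n}_{ij}\big) + (n-i)j\big(\Phi^{n}_{i+1,j-1} - \Phi^{n}_{ij}\big) + \binom{j}{2}\big(\Phi^{n}_{i,j-1} - \Phi^{n}_{ij}\big)\Big),$

where terms with a vanishing prefactor are omitted.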

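Proof. First, observe that for $n \geq 2$

(8.7) $\langle \nabla e^{-\lambda \ell_n(\underline{\underline r})}, \underline{\underline 2}\rangle = -n\lambda \cdot e^{-\lambda \ell_n(\underline{\underline r})},$

which explains the first term on the right-hand side of (8.6). Mutation to
$\bullet$ occurs at rate $\vartheta_{\bullet}/2$ and to $\circ$ at rate $\vartheta_{\circ}/2$. Hence, for $\phi\colon I \to \mathbb R$,

(8.8) $B\phi(u) = \frac{\vartheta_{\circ}}{2}\,1_{\{u=\bullet\}}\big(\phi(\circ) - \phi(\bullet)\big) + \frac{\vartheta_{\bullet}}{2}\,\big(1 - 1_{\{u=\bullet\}}\big)\big(\phi(\bullet) - \phi(\circ)\big).$

In particular,

(8.9) $B 1_{\{u=\bullet\}} = -\frac{\vartheta_{\circ}}{2}\,1_{\{u=\bullet\}} + \frac{\vartheta_{\bullet}}{2}\,\big(1 - 1_{\{u=\bullet\}}\big) = \frac{\vartheta_{\bullet}}{2} - \frac{\vartheta}{2}\,1_{\{u=\bullet\}}.$

Since the mutation operator acts on all components in $\phi^{n}_{ij}$ separately, we
obtain the second and third term in (8.6). Finally, resampling can happen
between any of the $\binom{n+j}{2}$ pairs, with different results within and outside the
subtree, and the result follows. □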
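The bookkeeping behind the resampling cases can be sketched in a few lines. The block below is an illustration under stated assumptions: each marked point mutates as in (8.9), a resampling event inside the subtree reduces the number of leaves by one, and an event between the subtree and an outside marked point moves the mark. The function `omega0` and its return format (a dictionary mapping index triples $(n, i, j)$ to coefficients) are hypothetical and not taken from the paper.

```python
from fractions import Fraction
from math import comb

def omega0(n, i, j, lam, gamma, theta_b, theta):
    """Coefficients c[(m, a, b)] such that Omega^0 Phi^n_{ij} = sum of c * Phi^m_{ab}."""
    out = {}

    def add(key, c):
        if c != 0:
            out[key] = out.get(key, 0) + c

    me = (n, i, j)
    # Growth of the subtree length (a single leaf has length zero and does not grow).
    if n >= 2:
        add(me, -n * lam)
    # Mutation acts on each of the i + j marked points separately.
    half = Fraction(1, 2)
    add((n, i - 1, j), half * i * theta_b)
    add(me, -half * i * theta)
    add((n, i, j - 1), half * j * theta_b)
    add(me, -half * j * theta)
    # Resampling: (number of unordered pairs, resulting function).
    pairs = [
        (comb(n - i, 2) + i * (n - i), (n - 1, i, j)),  # inside, at most one point marked
        (comb(i, 2), (n - 1, i - 1, j)),                # inside, both points marked
        (i * j + comb(j, 2), (n, i, j - 1)),            # one outside mark merges into another mark
        ((n - i) * j, (n, i + 1, j - 1)),               # outside mark moves onto an unmarked leaf
    ]
    for mult, target in pairs:
        add(target, gamma * mult)
        add(me, -gamma * mult)
    return out

# Two unmarked leaves: -2*lam*Phi^2_00 + gamma*(Phi^1_00 - Phi^2_00).
print(omega0(2, 0, 0, lam=1, gamma=1, theta_b=1, theta=2))
```

For $n = 2$ and $i = j = 0$ this reproduces the single resampling term used for the distance of two sampled points; for $n = 2$, $i = 0$, $j = 2$ it gives the coefficients 1, 1 and 4 on the three resampling differences.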
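Proposition 8.5 ($\Phi^{n}_{ij}$ under neutrality). Let $\mathcal U^{0}_{\infty}$ be distributed as in
Theorem 4 with $\alpha = 0$ and the mutation given by (3.49). Then

(8.10) $\mathbf E[\Phi^{1}_{00}(\mathcal U^{0}_{\infty})] = 1,$

(8.11) $\mathbf E[\Phi^{1}_{10}(\mathcal U^{0}_{\infty})] = \mathbf E[\Phi^{1}_{01}(\mathcal U^{0}_{\infty})] = \frac{\vartheta_{\bullet}}{\vartheta},$

(8.12) $\mathbf E[\Phi^{1}_{11}(\mathcal U^{0}_{\infty})] = \mathbf E[\Phi^{1}_{02}(\mathcal U^{0}_{\infty})] = \frac{\vartheta_{\bullet}}{\vartheta}\cdot\frac{\vartheta_{\bullet}+\gamma}{\vartheta+\gamma},$

(8.13) $\mathbf E[\Phi^{2}_{00}(\mathcal U^{0}_{\infty})] = \frac{\gamma}{\gamma+2\lambda},$

(8.14) $\mathbf E[\Phi^{2}_{10}(\mathcal U^{0}_{\infty})] = \mathbf E[\Phi^{2}_{01}(\mathcal U^{0}_{\infty})] = \frac{\vartheta_{\bullet}}{\vartheta}\cdot\frac{\gamma}{\gamma+2\lambda},$

(8.15) $\mathbf E[\Phi^{2}_{20}(\mathcal U^{0}_{\infty})] = \frac{\vartheta_{\bullet}}{\vartheta}\cdot\frac{\gamma}{\gamma+2\lambda}\cdot\frac{\gamma+2\lambda+\vartheta_{\bullet}}{\gamma+2\lambda+\vartheta},$

(8.16) $\mathbf E[\Phi^{2}_{11}(\mathcal U^{0}_{\infty})] = \frac{\vartheta_{\bullet}}{\vartheta}\cdot\frac{\gamma}{3\gamma+2\lambda+\vartheta}\Big(\frac{\gamma+\vartheta_{\bullet}}{\gamma+2\lambda} + \frac{\gamma+\vartheta_{\bullet}}{\gamma+\vartheta} + \frac{\gamma}{\gamma+2\lambda}\cdot\frac{\gamma+2\lambda+\vartheta_{\bullet}}{\gamma+2\lambda+\vartheta}\Big),$

(8.17) $\mathbf E[\Phi^{2}_{02}(\mathcal U^{0}_{\infty})] = \frac{\vartheta_{\bullet}}{\vartheta}\cdot\frac{\gamma}{6\gamma+2\lambda+\vartheta}\Big(\frac{\gamma+\vartheta_{\bullet}}{\gamma+2\lambda} + \frac{\gamma+\vartheta_{\bullet}}{\gamma+\vartheta} + \frac{4\gamma}{3\gamma+2\lambda+\vartheta}\Big(\frac{\gamma+\vartheta_{\bullet}}{\gamma+2\lambda} + \frac{\gamma+\vartheta_{\bullet}}{\gamma+\vartheta} + \frac{\gamma}{\gamma+2\lambda}\cdot\frac{\gamma+2\lambda+\vartheta_{\bullet}}{\gamma+2\lambda+\vartheta}\Big)\Big).$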



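Proof. The proof is based on (8.1) for the special choice of functions
as in Definition 8.2. Clearly, (8.10) holds since $\Phi^{1}_{00}(\mathcal U^{0}_{\infty}) = 1$ by definition.
The left and the middle expression in (8.11) both give the probability that a
single chosen individual has the $\bullet$-allele. This is $\vartheta_{\bullet}/\vartheta$, as can, for example,
be seen from competing Poisson processes along the ancestral line of the one
chosen individual (or a generator calculation).

In the rest of the proof, we abbreviate

(8.18) $\Phi^{n}_{ij} := \mathbf E[\Phi^{n}_{ij}(\mathcal U^{0}_{\infty})].$

We have, using Lemma 8.4 for $\Phi^{1}_{02}$ and $\Phi^{1}_{11}$,

(8.19) $0 = (\vartheta_{\bullet}\Phi^{1}_{01} - \vartheta\,\Phi^{1}_{02}) + \gamma\,(2\Phi^{1}_{11} + \Phi^{1}_{01} - 3\Phi^{1}_{02}),$
$\qquad\ \ 0 = \frac12(\vartheta_{\bullet}\Phi^{1}_{01} - \vartheta\,\Phi^{1}_{11}) + \frac12(\vartheta_{\bullet}\Phi^{1}_{10} - \vartheta\,\Phi^{1}_{11}) + \gamma\,(\Phi^{1}_{10} - \Phi^{1}_{11}),$

which implies (8.12). For (8.13), the only nonvanishing resampling term in
(8.6) is the one for resampling within the subtree, which has rate $\gamma$; hence,
applying Lemma 8.4 for $\Phi^{2}_{00}$,

(8.20) $0 = -2\lambda\,\Phi^{2}_{00} + \gamma\,(1 - \Phi^{2}_{00}),$

and the result follows. [Of course, (8.13) can also be shown by the fact that
the MRCA of two sampled individuals in equilibrium has a coalescence time
which is exponential with rate $\gamma$.]

Let us turn to (8.14). We find from (8.6),

(8.21) $0 = -2\lambda\,\Phi^{2}_{10} + \frac12(\vartheta_{\bullet}\Phi^{2}_{00} - \vartheta\,\Phi^{2}_{10}) + \gamma\,(\Phi^{1}_{10} - \Phi^{2}_{10}),$
$\qquad\ \ 0 = -2\lambda\,\Phi^{2}_{01} + \frac12(\vartheta_{\bullet}\Phi^{2}_{00} - \vartheta\,\Phi^{2}_{01}) + \gamma\,(\Phi^{1}_{01} - \Phi^{2}_{01} + 2\Phi^{2}_{10} - 2\Phi^{2}_{01}).$

From the difference of the last two equations, the first equality in (8.14)
follows. Solving the first equation for $\Phi^{2}_{10}$ and using (8.11) and (8.13) then
gives the second equality in (8.14). [Again, we remark that (8.14) is not
surprising: $\Phi^{2}_{10}$ as well as $\Phi^{2}_{01}$ give the Laplace transform for two randomly
chosen points, given one of the points or a third point has type $\bullet$. Following
back the ancestral line of the latter point shows that the Laplace transform
is independent of the type of the other chosen individual.]

Next, we have

(8.22) $0 = -2\lambda\,\Phi^{2}_{20} + (\vartheta_{\bullet}\Phi^{2}_{10} - \vartheta\,\Phi^{2}_{20}) + \gamma\,(\Phi^{1}_{10} - \Phi^{2}_{20}),$

which shows (8.15). For (8.16) and (8.17), we have the pair of equations

(8.23) $0 = -2\lambda\,\Phi^{2}_{11} + \frac12(\vartheta_{\bullet}\Phi^{2}_{01} - \vartheta\,\Phi^{2}_{11}) + \frac12(\vartheta_{\bullet}\Phi^{2}_{10} - \vartheta\,\Phi^{2}_{11}) + \gamma\,(\Phi^{1}_{11} - \Phi^{2}_{11} + \Phi^{2}_{10} - \Phi^{2}_{11} + \Phi^{2}_{20} - \Phi^{2}_{11}),$
$\qquad\ \ 0 = -2\lambda\,\Phi^{2}_{02} + (\vartheta_{\bullet}\Phi^{2}_{01} - \vartheta\,\Phi^{2}_{02}) + \gamma\,(\Phi^{1}_{02} - \Phi^{2}_{02} + \Phi^{2}_{01} - \Phi^{2}_{02} + 4\Phi^{2}_{11} - 4\Phi^{2}_{02}).$

Solving this linear system (e.g., by using Mathematica) gives the assertions. □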
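The linear system can also be solved with exact rational arithmetic instead of a computer algebra system. The following sketch assumes the equilibrium equations in the form $0 = \Omega^{0}\Phi^{n}_{ij}$ with per-point mutation rates $\vartheta_{\bullet}/2$ and $\vartheta_{\circ}/2$, and uses that the expected values of $\Phi^{2}_{10}$ and $\Phi^{2}_{01}$ coincide under neutrality; the function names, the dictionary keys and the parameter values are hypothetical.

```python
from fractions import Fraction as F

def neutral_moments(lam, gamma, theta_b, theta_w):
    """Forward substitution through the neutral equilibrium equations for n <= 2."""
    theta = theta_b + theta_w
    p = theta_b / theta                                   # one sampled point carries the allele
    m1_11 = p * (theta_b + gamma) / (theta + gamma)       # two sampled points carry it
    m2_00 = gamma / (gamma + 2 * lam)                     # Laplace transform of the distance
    # Equation for Phi^2_10; Phi^2_01 has the same expected value.
    m2_10 = (theta_b * m2_00 / 2 + gamma * p) / (2 * lam + theta / 2 + gamma)
    m2_20 = (theta_b * m2_10 + gamma * p) / (2 * lam + theta + gamma)
    # The coupled pair: Phi^2_11 first, then Phi^2_02.
    m2_11 = (theta_b * m2_10 + gamma * (m1_11 + m2_10 + m2_20)) / (2 * lam + theta + 3 * gamma)
    m2_02 = (theta_b * m2_10 + gamma * (m1_11 + m2_10 + 4 * m2_11)) / (2 * lam + theta + 6 * gamma)
    return {"2_00": m2_00, "2_10": m2_10, "2_20": m2_20, "2_11": m2_11, "2_02": m2_02}

def neutral_combination(lam, gamma, theta_b, theta_w):
    """Closed form of E[Phi^2_20 - 4 Phi^2_11 + 3 Phi^2_02] under neutrality."""
    theta = theta_b + theta_w
    num = 2 * gamma * lam * theta_b * theta_w * (2 * gamma + 2 * lam + theta)
    den = (theta * (gamma + theta) * (gamma + 2 * lam)
           * (gamma + 2 * lam + theta) * (6 * gamma + 2 * lam + theta))
    return num / den

# Pass Fractions so that every division stays exact.
m = neutral_moments(F(1), F(1), F(1), F(1))
combo = m["2_20"] - 4 * m["2_11"] + 3 * m["2_02"]
print(combo, neutral_combination(F(1), F(1), F(1), F(1)))  # 1/75 1/75
```

The combination $\Phi^{2}_{20} - 4\Phi^{2}_{11} + 3\Phi^{2}_{02}$ is the only one that enters the second-order term of the expansion, which is why the sketch compares it with a closed form.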

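8.2. Proof of Theorem 5. First, by Lemma 8.1, $\mathbf E[\Phi(\mathcal U^{\alpha}_{\infty})] = \mathbf E[\Phi(\mathcal U^{0}_{\infty})] + O(\alpha)$
for $\alpha \to 0$. Hence, by applying (8.1) to the function $\Phi^{2}_{00}$ from Definition 8.2,

(8.24) $0 = -2\lambda\,\mathbf E[\Phi^{2}_{00}(\mathcal U^{\alpha}_{\infty})] + \gamma\cdot\mathbf E[1 - \Phi^{2}_{00}(\mathcal U^{\alpha}_{\infty})] + 2\alpha\,\mathbf E[\Phi^{2}_{10}(\mathcal U^{\alpha}_{\infty}) - \Phi^{2}_{01}(\mathcal U^{\alpha}_{\infty})].$

Since

(8.25) $\mathbf E[\Phi^{2}_{10}(\mathcal U^{\alpha}_{\infty}) - \Phi^{2}_{01}(\mathcal U^{\alpha}_{\infty})] = \mathbf E[\Phi^{2}_{10}(\mathcal U^{0}_{\infty}) - \Phi^{2}_{01}(\mathcal U^{0}_{\infty})] + O(\alpha) = O(\alpha)$

by Lemma 8.1 and Lemma 8.4 [see (8.14)], we find that

(8.26) $\mathbf E[\Phi^{2}_{00}(\mathcal U^{\alpha}_{\infty})] = \frac{\gamma}{\gamma+2\lambda} + O(\alpha^2).$

Now, in order to compute $\mathbf E[\Phi^{2}_{10}(\mathcal U^{\alpha}_{\infty}) - \Phi^{2}_{01}(\mathcal U^{\alpha}_{\infty})]$ more accurately, up to
second order in $\alpha$, we apply the equilibrium condition (8.1) on $\Phi^{2}_{10} - \Phi^{2}_{01}$
and obtain, since $\Phi^{1}_{10} = \Phi^{1}_{01}$,

(8.27) $0 = \big(-2\lambda - \tfrac{\vartheta}{2}\big)\,\mathbf E[(\Phi^{2}_{10} - \Phi^{2}_{01})(\mathcal U^{\alpha}_{\infty})] + \gamma\,\big(-\mathbf E[(\Phi^{2}_{10} - \Phi^{2}_{01})(\mathcal U^{\alpha}_{\infty})] - 2\,\mathbf E[(\Phi^{2}_{10} - \Phi^{2}_{01})(\mathcal U^{\alpha}_{\infty})]\big)$
$\qquad\quad + \alpha\,\big(\mathbf E[(\Phi^{2}_{10} + \Phi^{2}_{20} - 2\Phi^{2}_{11})(\mathcal U^{\alpha}_{\infty})] - \mathbf E[(2\Phi^{2}_{11} + \Phi^{2}_{01} - 3\Phi^{2}_{02})(\mathcal U^{\alpha}_{\infty})]\big)$
$\qquad = \big(-2\lambda - \tfrac{\vartheta}{2} - 3\gamma\big)\,\mathbf E[(\Phi^{2}_{10} - \Phi^{2}_{01})(\mathcal U^{\alpha}_{\infty})] + \alpha\,\mathbf E[(\Phi^{2}_{10} - \Phi^{2}_{01} + \Phi^{2}_{20} - 4\Phi^{2}_{11} + 3\Phi^{2}_{02})(\mathcal U^{\alpha}_{\infty})]$
$\qquad = \big(-2\lambda - \tfrac{\vartheta}{2} - 3\gamma\big)\,\mathbf E[(\Phi^{2}_{10} - \Phi^{2}_{01})(\mathcal U^{\alpha}_{\infty})] + \alpha\,\mathbf E[(\Phi^{2}_{20} - 4\Phi^{2}_{11} + 3\Phi^{2}_{02})(\mathcal U^{0}_{\infty})] + O(\alpha^2).$

In particular, under neutrality, by Proposition 8.5,

(8.28) $\mathbf E[\Phi^{2}_{20}(\mathcal U^{0}_{\infty}) - 4\Phi^{2}_{11}(\mathcal U^{0}_{\infty}) + 3\Phi^{2}_{02}(\mathcal U^{0}_{\infty})] = \frac{2\gamma\lambda\,\vartheta_{\bullet}\vartheta_{\circ}\,(2\gamma+2\lambda+\vartheta)}{\vartheta\,(\gamma+\vartheta)(\gamma+2\lambda)(\gamma+2\lambda+\vartheta)(6\gamma+2\lambda+\vartheta)}.$

Now, combining (8.24), (8.27) and (8.28), we see that

$\mathbf E[\Phi^{2}_{00}(\mathcal U^{\alpha}_{\infty})] = \frac{\gamma}{\gamma+2\lambda} + \frac{2\alpha}{\gamma+2\lambda}\,\mathbf E[(\Phi^{2}_{10} - \Phi^{2}_{01})(\mathcal U^{\alpha}_{\infty})]$
$\qquad = \frac{\gamma}{\gamma+2\lambda} + \frac{2\alpha^2}{(\gamma+2\lambda)(3\gamma+2\lambda+(1/2)\vartheta)}\,\mathbf E[(\Phi^{2}_{20} - 4\Phi^{2}_{11} + 3\Phi^{2}_{02})(\mathcal U^{0}_{\infty})] + O(\alpha^3)$
$\qquad = \frac{\gamma}{\gamma+2\lambda} + \alpha^2\,\frac{8\gamma\lambda\,\vartheta_{\bullet}\vartheta_{\circ}\,(2\gamma+2\lambda+\vartheta)}{\vartheta\,(\gamma+\vartheta)(\gamma+2\lambda+\vartheta)(6\gamma+2\lambda+\vartheta)(\gamma+2\lambda)^2(6\gamma+4\lambda+\vartheta)} + O(\alpha^3),$

and the assertion follows.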
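To get a feeling for the size of the correction, the expansion can be evaluated numerically. The sketch below assumes the second-order coefficient in the form $8\gamma\lambda\vartheta_{\bullet}\vartheta_{\circ}(2\gamma+2\lambda+\vartheta)$ divided by $\vartheta(\gamma+\vartheta)(\gamma+2\lambda+\vartheta)(6\gamma+2\lambda+\vartheta)(\gamma+2\lambda)^2(6\gamma+4\lambda+\vartheta)$, with $\vartheta = \vartheta_{\bullet} + \vartheta_{\circ}$; the function names and parameter values are hypothetical, and the $O(\alpha^3)$ remainder is simply dropped.

```python
from fractions import Fraction as F

def alpha2_coefficient(lam, gamma, theta_b, theta_w):
    """Coefficient of alpha^2 in the expansion of E[exp(-lam * R_12)]."""
    theta = theta_b + theta_w
    num = 8 * gamma * lam * theta_b * theta_w * (2 * gamma + 2 * lam + theta)
    den = (theta * (gamma + theta) * (gamma + 2 * lam + theta)
           * (6 * gamma + 2 * lam + theta) * (gamma + 2 * lam) ** 2
           * (6 * gamma + 4 * lam + theta))
    return num / den

def laplace_second_order(alpha, lam, gamma, theta_b, theta_w):
    """Neutral Laplace transform plus the alpha^2 correction (remainder ignored)."""
    neutral = gamma / (gamma + 2 * lam)
    return neutral + alpha ** 2 * alpha2_coefficient(lam, gamma, theta_b, theta_w)

# Unit rates, weak selection alpha = 1/10.
print(alpha2_coefficient(F(1), F(1), F(1), F(1)))             # 1/675
print(laplace_second_order(F(1, 10), F(1), F(1), F(1), F(1)))
```

The coefficient vanishes at $\lambda = 0$, as it must for a Laplace transform, and is positive for $\lambda > 0$, which is consistent with genealogical distances being shorter under weak selection.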

APPENDIX: NOTATION 
We collect the most important notation here: 

► $N$: population size of Moran model (Section 2),

► $I$: type space, compact metric space (Section 2),

► $U_N := \{1, \ldots, N\}$ (Definition 2.2),

► $S_N := R_N \times [0, \infty)$ (Definition 2.2),

► $A_s(\iota, t) \in U_N$: ancestor of individual $\iota$ at time $s$ (Definition 2.2),

► $\eta$: Poisson processes (Definition 2.2),

► $\gamma$: resampling rate (2.2),

► $\vartheta$: mutation rate (2.3),

► $\beta(u, dv)$: transition kernel on $I$ for mutation (2.3),

► $\bar\beta, \tilde\beta$: two components of $\beta$ for a parent-independent part (3.18),

► $\alpha$: selection coefficient (2.4),

► $\chi(u)$; $\chi'(u, v)$: haploid fitness of type $u$ and diploid fitness of $\{u, v\}$ (2.4), (2.5),

► $\chi_k, \chi'_{k,l}$: fitness functions for measure-valued process (3.21), (3.22),

► $\mathbb M^I$: set of marked metric measure spaces (3.1),

► U^U^: state space of the processes (Definition 3.2), 

>■ X.= {X, f,^J)■,u = {U, r, fi): generic elements of HJ-'^ (Definition 3.2), 

► J7: generator of the measure-valued Fleming-Viot process (3.13), 

► $\Omega$: generator of the TFVMS, also $\Omega^{\alpha}$ (3.13),

► $\mathcal U = (\mathcal U_t)_{t\geq 0}$: the TFVMS (Theorem 1),

► $\mathcal U^{N} = (\mathcal U^{N}_t)_{t\geq 0}$: the TMMMS (Definition 3.13),

► $\mathcal U_{\infty}$: long-time limit of $\mathcal U$ (Theorem 4),

► $\zeta^{N}$: measure-valued Moran model (2.6),

► $\zeta$: measure-valued Fleming-Viot process (Example 3.9),

► if.E^E': embedding (Remark 3.1), 

► v'^: distance matrix distribution (3.4), 

► $\Sigma$: set of permutations (3.5),

► $\theta$: resampling operator (3.15),

► $R_{\sigma}$: map exchanging indices according to permutation $\sigma$ (3.6),

► $\Phi = \Phi^{n,\phi}$: polynomial (3.10),

► n,n^: set of polynomials (3.11), 

► Cfc) '^k'- shift operators (5.5), (5.9), 

► p^: shift operator (3.38), 

► Hn- polynomials for finite populations (6.6), 
► $R_{12}$: distance of two randomly chosen points (Remark 3.15),

► T: state space of function- valued dual process (5.1), 

► $\mathcal H$: dual process (Definition 5.1),

► $\ell_n$: tree length for $n$ individuals (8.4).

Acknowledgments. We thank Anton Wakolbinger for fruitful discussion 
and Steve Evans for pointing us to the paper of Bakry and Emery (1985). 
Part of this work has been carried out when A. Depperschmidt was taking 
part in the Junior Trimester Program Stochastics at the Hausdorff Center 
in Bonn: hospitality and financial support are gratefully acknowledged. 

REFERENCES 

Bakry, D. and Emery, M. (1985). Diffusions hypercontractives. In Séminaire de
Probabilités, XIX, 1983/84. Lecture Notes in Math. 1123 177-206. Springer, Berlin.
MR0889476

Barton, N. H., Etheridge, A. M. and Sturm, A. K. (2004). Coalescence in a random 
background. Ann. Appl. Probab. 14 754-785. MR2052901 

Bertoin, J. and Le Gall, J.-F. (2003). Stochastic flows associated to coalescent
processes. Probab. Theory Related Fields 126 261-288. MR1990057

Bertoin, J. and Le Gall, J.-F. (2005). Stochastic flows associated to coalescent pro- 
cesses. II. Stochastic differential equations. Ann. Inst. Henri Poincare Probab. Stat. 41 
307-333. MR2139022 

Bertoin, J. and Le Gall, J.-F. (2006). Stochastic flows associated to coalescent pro- 
cesses. III. Limit theorems. Illinois J. Math. 50 147-181 (electronic). MR2247827 

Dawson, D. A. (1993). Measure-valued Markov processes. In École d'Été de Probabilités
de Saint-Flour XXI-1991 (P. L. Hennequin, ed.). Lecture Notes in Math. 1541 1-260.
Springer, Berlin. MR1242575

Dawson, D. A., Greven, A. and Vaillancourt, J. (1995). Equilibria and quasiequilibria
for infinite collections of interacting Fleming-Viot processes. Trans. Amer. Math.
Soc. 347 2277-2360. MR1297523

Dawson, D. A. and Greven, A. (1999). Hierarchically interacting Fleming-Viot pro- 
cesses with selection and mutation: Multiple space time scale analysis and quasi- 
equilibria. Electron. J. Probab. 4 81 pp. (electronic). MR1670873 

Dawson, D. A. and Greven, A. (2011). Duality for spatially interacting Fleming-Viot 
processes with mutation and selection. Available at arXiv:1104.1099. 

Dawson, D. A. and Greven, A. (2012a). Multiscale analysis: Fisher-Wright diffusions
with rare mutations and selection, logistic branching system. In Probability in Complex
Physical Systems: In Honour of Erwin Bolthausen and Jürgen Gärtner (J.-D. Deuschel,
B. Gentz, W. König, M. von Renesse, M. Scheutzow and U. Schmock, eds.).
Springer Proceedings in Mathematics 11 373-408. Springer, Berlin.

Dawson, D. A. and Greven, A. (2012b). On the effects of migration in spatial Fleming- 
Viot models with selection and mutation. Unpublished manuscript. 

Dawson, D. A. and March, P. (1995). Resolvent estimates for Fleming-Viot operators
and uniqueness of solutions to related martingale problems. J. Funct. Anal. 132
417-472. MR1347357

Delmas, J.-F., Dhersin, J.-S. and Siri-Jegousse, A. (2010). On the two oldest families 
for the Wright-Fisher process. Electron. J. Probab. 15 776-800. MR2653183 



Depperschmidt, A., Greven, A. and Pfaffelhuber, P. (2011). Marked metric measure
spaces. Electron. Commun. Probab. 16 174-188. MR2783338

Donnelly, P. and Kurtz, T. G. (1996). A countable representation of the Fleming-Viot
measure-valued diffusion. Ann. Probab. 24 698-742. MR1404525

Donnelly, P. and Kurtz, T. G. (1999). Genealogical processes for Fleming-Viot models
with selection and recombination. Ann. Appl. Probab. 9 1091-1148. MR1728556

Etheridge, A. (2001). An Introduction to Superprocesses. Amer. Math. Soc., Providence,
RI.

Etheridge, A. M. and Griffiths, R. C. (2009). A coalescent dual process in a Moran
model with genic selection. Theoret. Population Biol. 75 320-330. Sam Karlin: Special
issue.

Etheridge, A., Pfaffelhuber, P. and Wakolbinger, A. (2006). An approximate sam- 
pling formula under genetic hitchhiking. Ann. Appl. Probab. 16 685-729. MR2244430 

Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Con- 
vergence. Wiley, New York. MR0838085 

Ethier, S. N. and Kurtz, T. G. (1993). Fleming-Viot processes in population genetics. 
SIAM J. Control Optim. 31 345-386. MR1205982 

Ethier, S. N. and Kurtz, T. G. (1998). Coupling and ergodic theorems for Fleming-Viot 
processes. Ann. Probab. 26 533-561. MR1626158 

Ethier, S. N. and Shiga, T. (2000). A Fleming-Viot process with unbounded selection.
J. Math. Kyoto Univ. 40 337-361. MR1787875

Evans, S. (2000). Kingman's coalescent as a random metric space. In Stochastic Models:
Proceedings of the International Conference on Stochastic Models in Honour of
Professor Donald A. Dawson, Ottawa, Canada, June 10-13, 1998 (L. G. Gorostiza and
B. G. Ivanoff, eds.). Canadian Math. Soc., Ottawa, ON.

Evans, S. N. and Lidman, T. (2007). Asymptotic evolution of acyclic random mappings. 
Electron. J. Probab. 12 1151-1180 (electronic). MR2336603 

Evans, S. N., Pitman, J. and Winter, A. (2006). Rayleigh processes, real trees, and 
root growth with re-grafting. Probab. Theory Related Fields 134 81-126. MR2221786 

Evans, S. N. and Winter, A. (2006). Subtree prune and regraft: A reversible real tree- 
valued Markov process. Ann. Probab. 34 918-961. MR2243874 

Fearnhead, P. (2001). Perfect simulation from population genetic models with selection.
Theoret. Population Biol. 59 263-279.

Fearnhead, P. (2002). The common ancestor at a nonneutral locus. J. Appl. Probab. 39 
38-54. MR1895142 

Fleming, W. H. and Viot, M. (1978). Some measure-valued population processes. In
Stochastic Analysis (Proc. Internat. Conf., Northwestern Univ., Evanston, Ill., 1978)
97-108. Academic Press, New York. MR0517236

Fukushima, M. and Stroock, D. (1986). Reversibility of solutions to martingale
problems. In Probability, Statistical Mechanics, and Number Theory. Adv. Math. Suppl. Stud.
9 107-123. Academic Press, Orlando, FL. MR0875449

Greven, A., Pfaffelhuber, P. and Winter, A. (2009). Convergence in distribution of
random metric measure spaces (Λ-coalescent measure trees). Probab. Theory Related
Fields 145 285-322. MR2520129

Greven, A., Pfaffelhuber, P. and Winter, A. (2012). Tree-valued resampling dynamics
(Martingale problems and applications). Probab. Theory Related Fields. To appear.

Hamilton, W. D. (1964a). The genetical evolution of social behaviour. I. J. Theoret. 
Biol. 7 1-16. 

Hamilton, W. D. (1964b). The genetical evolution of social behaviour. II. J. Theoret. 
Biol. 7 17-52. 



Itatsu, S. (2002). Ergodic properties of Fleming-Viot processes with selection. Rep. Fac.
Sci. Shizuoka Univ. 36 1-14. MR1952743

Kaj, I. and Krone, S. M. (2003). The coalescent process in a population with stochas- 
tically varying size. J. Appl. Probab. 40 33-48. MR1953766 

Kallenberg, O. (2002). Foundations of Modern Probability, 2nd ed. Springer, New York. 
MR1876169 

Kaplan, N. L., Darden, T. and Hudson, R. R. (1988). The coalescent process in models 
with selection. Genetics 120 819-829. 

Kaplan, N. L., Hudson, R. R. and Langley, C. H. (1989). The "Hitchhiking effect" 
revisited. Genetics 123 887-899. 

Kingman, J. F. C. (1978). A simple model for the balance between selection and muta- 
tion. J. Appl. Probab. 15 1-12. MR0465272 

Kingman, J. F. C. (1982). The coalescent. Stochastic Process. Appl. 13 235-248. 
MR0671034 

Krone, S. M. and Neuhauser, C. (1997). Ancestral processes with selection. Theoret.
Population Biol. 51 210-237.

Liggett, T. M. (1985). Interacting Particle Systems. Grundlehren der Mathematischen
Wissenschaften [Fundamental Principles of Mathematical Sciences] 276. Springer, New
York. MR0776231

Mano, S. (2009). Duality, ancestral and diffusion processes in models with selection. 
Theoret. Population Biol. 75 164-175. 

Möhle, M. and Sagitov, S. (2001). A classification of coalescent processes for haploid
exchangeable population models. Ann. Probab. 29 1547-1562. MR1880231

Neuhauser, C. and Krone, S. M. (1997). The genealogy of samples in models with 
selection. Genetics 154 519-534. 

Pfaffelhuber, P. and Wakolbinger, A. (2006). The process of most recent common
ancestors in an evolving coalescent. Stochastic Process. Appl. 116 1836-1859. MR2307061

Revuz, D. and Yor, M. (1999). Continuous Martingales and Brownian Motion, 3rd ed.
Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of
Mathematical Sciences] 293. Springer, Berlin. MR1725357

Slade, P. F. (2000a). Most recent common ancestor probability distributions in gene
genealogies under selection. Theoret. Population Biol. 58 291-305.

Slade, P. F. (2000b). Simulation of selected genealogies. Theoret. Population Biol. 57
35-49.

Tajima, F. (1983). Evolutionary relationship of DNA sequences in finite populations. 
Genetics 105 437-460. 

Taylor, J. E. (2007). The common ancestor process for a Wright-Fisher diffusion.
Electron. J. Probab. 12 808-847. MR2318411

Uyenoyama, M. K., Feldman, M. W. and Mueller, L. D. (1981). Population genetic
theory of kin selection: Multiple alleles at one locus. Proc. Natl. Acad. Sci. USA 78
5036-5040. MR0627261

Wakeley, J. and Sargsyan, O. (2009). The conditional ancestral selection graph with
strong balancing selection. Theoret. Population Biol. 75 355-364.

Watterson, G. A. (1975). On the number of segregating sites in genetical models without
recombination. Theoret. Population Biol. 7 256-276. MR0366430

Zambotti, L. (2001). A reflected stochastic heat equation as symmetric dynamics with
respect to the 3-d Bessel bridge. J. Funct. Anal. 180 195-209. MR1814427

Zambotti, L. (2002). Integration by parts on Bessel bridges and related stochastic partial
differential equations. C. R. Math. Acad. Sci. Paris 334 209-212. MR1891060

Zambotti, L. (2003). Integration by parts on δ-Bessel bridges, δ > 3 and related SPDEs.
Ann. Probab. 31 323-348. MR1959795



A. Depperschmidt
P. Pfaffelhuber

Abteilung für mathematische Stochastik
Albert-Ludwigs University of Freiburg
Eckerstr. 1
79104 Freiburg
Germany

E-MAIL: depperschmidt@stochastik.uni-freiburg.de
p.p@stochastik.uni-freiburg.de

URL:

http://www.stochastik.uni-freiburg.de/homepages/deppers/
http://www.stochastik.uni-freiburg.de/homepages/pfaffelh/



A. Greven 

Department Mathematik 
University of Erlangen 
Bismarckstr. fi 
91054 Erlangen 
Germany 

E-MAIL: greven@mi.uni-erlangen.de
URL:

http://www.mi.uni-erlangen.de/~greven



